Journal of Applied Psychology 


Edited by Donald G. Paterson, University of Minnesota 


Consulting Editors 


George K. Bennett, Psychological Corporation 
Harold E. Burtt, Ohio State University 
Allen L. Edwards, University of Washington 
Clifford E. Jurgensen, Minneapolis Gas Co. 
Irving Lorge, T. C. Columbia University 
Quinn McNemar, Stanford University 
Alexander Mintz, City College of New York 


James P. Porter, Claverack, New York 
Harold F. Rothe, Fairbanks, Morse and Co., 
Beloit, Wis. 
Julian B. Rotter, Ohio State University 
Edward K. Strong, Jr., Stanford University 
Donald E. Super, T. C. Columbia University 
Morris S. Viteles, University of Pennsylvania 
Alfred C. Welch, Knox-Reeves, Minneapolis 





Table of Contents 


Personality Test Scores in the Management Hierarchy: H. D. Meyer and G. L. Pressel
Temperament Measures in Industrial Selection: F. Herzberg
A Validation Study of the Worthington Personal History Blank: J. G. Clark and W. A. Owens  85
A Factor-Analytic Study of Supervisory and Group Behavior: R. C. Wilson, W. S. High, P. Beem, and A. L. Comrey
The Check List as a Criterion of Proficiency: A. I. Siegel
Identification and Prediction of Two Training Criterion Factors: W. R. Graham
Rater and Technique Contamination in Criterion Ratings: G. H. Falk and A. G. Bayroff
Validity versus Reliability: E. K. Strong, Jr.
Sampling Problems in Studies of Writing Style: R. D. Powers
Differential Prediction of Academic Success at Brigham Young University: J. B. Stone
Performance of College Students on a Mechanical Knowledge Test: B. Balinsky and C. Hujsa  111
Relation of Scholastic Aptitude to Socioeconomic Status and to a Rural-to-Urban Continuum: F. Washburne and D. C. Andrew
Further Results on Group Manual Dexterity in Men: A. L. Comrey and G. Deskin
Effects of Fatigue and Anxiety on Certain Psychomotor and Visual Functions: S. Ross, T. A. Hussman, and T. G. Andrews
Dimensional Analysis of Motion: VII. Extent and Direction of Manipulative Movements as Factors in Defining Motions: S. J. Harris and K. U. Smith
Discussion of Gilliland and Newman's “The Humm-Wadsworth Temperament Scale as an Indicator of the ‘Problem’ Employee”: D. G. Humm and K. A. Humm
Applied Psychology in Action:
Comment on Word Meaning: F. L. Wells
A Note on “The Non-Directive Approach in Advertising Appeals”:
The Measurement of Academic Freedom: W. Kerr
Book Reviews
New Books, Monographs, and Pamphlets





American Psychological Association 


Vol. 38, No. 2 


April, 1954 





Journal of Applied Psychology 


Published Bi-monthly by the American Psychological Association, Inc. 
Prince and Lemon Sts., Lancaster, Pa. 


Annual subscription, $7.00; single copies, $1.50 


Subscriptions and business communications should be sent to 
American Psychological Association 
1333 Sixteenth Street N.W. 
Washington 6, D. C. 


Articles for publication should be sent to the Editor-elect 


Dr. John G. Darley, Graduate School, University of Minnesota, 
Minneapolis 14, Minnesota. 





This journal gives prompt consideration to 
manuscripts reporting original investigations in 
any field of applied psychology except clinical 
and consulting psychology. A descriptive or 
theoretical article is occasionally accepted if it 
deals in a distinctive manner with a problem of 
applied psychology. The policy is, however, to 
favor papers dealing with quantitative investi- 
gations of direct value to psychologists working 
in the following fields: Vocational diagnosis and 
occupational guidance; educational diagnosis, 
prediction and guidance at the secondary school 
level and higher; personnel selection, training, 
placement, transfer and promotion in business, 
industry and government service including the 
armed forces; supervisory training in business, 
industry and government; bio-mechanics or de- 
sign of machines to fit the human operator; il- 
lumination, ventilation and fatigue in industry; 
job analysis, description, classification and eval- 
uation; measurement of morale of executives, 
supervisors, or employees; surveys of opinion on 
social or political issues, such as those conducted 
by The Psychological Corporation; psychological 
problems in market research and in advertising. 


Articles may be under 500 words. The maxi- 
mum is 12,000 words, the average in the 


neighborhood of 4,000 words. To reduce lag of 
publication, adherence to the rule of “brevity 
consistent with clarity” is encouraged. 


A lapse of six to twelve months occurs between 
acceptance of an article and its publication, the 
lag varying with the rate at which manuscripts 
are submitted. If, however, an author is pre- 
pared to defray the costs of printing the neces- 
sary extra pages, he may arrange for earlier 
publication without thereby postponing the ap- 
pearance of manuscripts by other contributors. 
This enables the management to provide space in 
addition to the scheduled 64 pages per issue. 
“Early publication” is thus a direct contribution 
to the subscribers. By cutting down lag in pub- 
lication, it also benefits those authors whose 
articles are published in regular turn. 


Tables, footnotes and references as well as 
text of manuscripts should be typed double-spaced 
throughout. Authors should adhere to the con- 
ventions described in the “Publication Manual 
of the American Psychological Association,” 
Psychol. Bull., 1952, 49, No. 4, Part 2. A copy 
of the Manual will be loaned to any prospective 
contributor who does not find it in his library. 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879 


Acceptance for mailing at the 


special rate of postage provided for S Semeneh 660, Sate 9600, 


P. L. & R. of 1948, authorized October 10, 194 
Copyright 1954 by the American Psychological Association, Inc. 





Journal of Applied Psychology 








Vol. 38, No. 2 


APRIL, 1954 








Personality Test Scores in the Management Hierarchy 


Henry D. Meyer and Glenn L. Pressel 


Stevenson, Jordan & Harrison, Inc., Chicago, Illinois 


The primary purpose of this study was to 
obtain, if possible, an indirect but compre- 
hensive industrial validation of the paper and 
pencil personality test developed by Steven- 
son, Jordan & Harrison psychologists for use 
as an interview aid in their work of apprais- 
ing candidates and incumbents for manage- 
ment positions in industry. The basic con- 
cept implicit in the development of the test 
was that certain personality traits become in- 
creasingly desirable in incumbents and ap- 
plicants as the positions bear increasing re- 
sponsibility and relative status in the man- 
agement hierarchy.¹ The most obvious test 
of validity of these traits would be to deter- 
mine whether or not the people holding po- 
sitions at different management hierarchy lev- 
els show differences in these same test traits 
and whether the differences show constant in- 
crements as the hierarchy levels increase. 

Such a validation study does not make any 
discrimination between the competent and in- 
competent person at any given level. Rather, 
the assumption is made that some complex 
selective survival and elimination process is 
operating because consistently fewer persons 
achieve successively higher levels in the man- 
agement hierarchy. If the personality traits 
tested are pertinent to such selectivity, that 
fact should be apparent in the distribution of 
trait scores at the various hierarchy levels. 

This type of “selection” criterion was much 
more acceptable to the present authors than a 
criterion based on some authoritative group’s 
judgments of the managerial competence of 
executives or managers. Not only did the 
former type of criterion eliminate the neces- 
sity of getting agreement of judges as to what 


¹ The word “management” is used broadly here to 
characterize all positions above the hourly rate level 
from foreman, engineer, salesman, or accountant, up 
to president. 




is managerial competence and how it is ob- 
served, but also it allowed the study to pro- 
ceed around the design of a statistical analy- 
sis of previously obtained data from S. J. & 
H. files. The execution of the study there- 
fore became a formal test of the hypothesis 
that there are trends in personality trait test 
scores as one proceeds from lower to higher 
level positions in the management hierarchy. 


Selection Procedures 


The industrial management hierarchy was di- 
vided into five job levels with officers and gen- 
eral managers at the highest level and hourly rate 
workers at the lowest level. A total of 100 cases 
for each level except at the top² were selected 
from S. J. & H. personnel evaluation test files. 
Each case had been given, at the original time of 
testing, the improved Form B of the Employee 
Questionnaire, the S. J. & H. personality test. 

The Personality Test. The Employee Ques- 
tionnaire, known as the E. Q. Test, is a brief 
industrial personality test developed and de- 
scribed in the literature by previous S. J. & H. 
psychologists headed by H. F. Rothe (1, 3, 4) 
and subsequently improved by increasing the 
number of items from 50 to 75 and modifying 
the trait scoring keys according to the results of 
an item analysis. The tests were scored on 
seven trait keys with 8 to 12 items in each trait 
key and with simple scoring of items without 
weighting. These traits were objectivity, social 
dominance, drive, detail, emotionality, extraver- 
sion (sociability), and (poor) adjustment. Rothe 
(4) has discussed the definition of all of these 
trait terms except detail, which may be defined 
as the liking for detail in work, thought, and 
recreation; the desire to personally take care of 
all the details of projects in which one is in- 
volved. For each trait the mean score and the 
standard deviation for each of the five hier- 
archy levels were computed. 
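
The scoring and summary steps described above can be sketched as follows. This is only an illustration under stated assumptions: the actual E. Q. scoring keys are not given in the text, so the keys, item numbers, and answers below are hypothetical, and the sample (n - 1) standard deviation is assumed because the text does not say which formula was used.

```python
from statistics import mean, stdev

# Hypothetical scoring keys: trait -> {item number: keyed answer}.
# The real E. Q. keys (8 to 12 items per trait) are not given in the text.
TRAIT_KEYS = {
    "detail": {1: "yes", 4: "no", 7: "yes"},
    "drive": {2: "yes", 5: "yes", 9: "no"},
}

def score_traits(answers, keys=TRAIT_KEYS):
    """Unweighted scoring: one point for each answer that matches the key."""
    return {
        trait: sum(1 for item, keyed in key.items() if answers.get(item) == keyed)
        for trait, key in keys.items()
    }

def level_summary(cases, trait):
    """Mean and sample standard deviation of one trait at each hierarchy level."""
    by_level = {}
    for level, scores in cases:
        by_level.setdefault(level, []).append(scores[trait])
    return {level: (mean(v), stdev(v)) for level, v in by_level.items()}

# Hypothetical cases: (hierarchy level, trait scores from one answer sheet).
cases = [
    ("I", score_traits({1: "no", 4: "no", 7: "no", 2: "yes", 5: "yes", 9: "no"})),
    ("I", score_traits({1: "yes", 4: "yes", 7: "no", 2: "yes", 5: "no", 9: "no"})),
    ("V", score_traits({1: "yes", 4: "no", 7: "yes", 2: "no", 5: "yes", 9: "yes"})),
    ("V", score_traits({1: "yes", 4: "no", 7: "no", 2: "yes", 5: "no", 9: "yes"})),
]
summary = level_summary(cases, "detail")
```

Each entry of `summary` corresponds to one M and S.D. pair for a single trait at a single hierarchy level.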

No claim is made that each trait is a pure, 
unitary factor. Rather, the definitions describe 


² Only 57 cases were available from July 1949 to 
February 1952 which filled the requirements of being 
in the top category and taking the improved Em- 
ployee Questionnaire, Form B personality test. 







traits that are felt to be relevant to management 
success. The major intercorrelations among traits 
for 161 cases where objectivity is held constant 
at a median score are as follows: social dominance 
and extraversion, r = .78; adjustment and emo- 
tionality, r = .67; detail and emotionality, r = 
.66; adjustment and detail, r = .52; detail and 
drive, r = .46; adjustment and social dominance, 
r = -.32; emotionality and drive, r = .28; ad- 
justment and extraversion, r = -.26; and extra- 
version and drive, r = .25. 

A factor analysis³ of the same data, i.e., with 
objectivity held constant, reveals two major clus- 
ters and one minor cluster. The first cluster is 
social dominance and extraversion; the second, 
detail, emotionality, and adjustment; and the 
minor one, drive. 

Categorizing the Hierarchy. On the basis of 
the senior author’s experience in consulting with 
industrial concerns at all levels of management, 
the hierarchy was broken down into five grades 
of job status according to job titles as follows: 

I. President, Vice President, Treasurer, Gen- 
eral Manager, General Sales Manager and Ex- 
ecutive Engineer. 

II. Works or Plant Manager, Sales Manager, 
Chief Engineer, Chief Industrial Engineer, Con- 
troller, Industrial Relations Director, Purchasing 
Director. 

III. Production Superintendents, Industrial 
Salesmen, Sales Engineers, Department or Sec- 
tion Heads in Accounting, Industrial Engineer- 
ing, Design Engineering, Inside Sales, Purchas- 
ing and Personnel. 

IV. Production Foremen, Accountants, Design 
and Process Engineers, Time Study and Produc- 
tion Control Men, Sales Correspondents, Jr. In- 
dustrial Salesmen, Personnel Men. 

V. Clerks and Factory Workers. 

This breakdown of job status into hierarchy 
levels was reviewed and accepted by a supervis- 
ing engineer of the S. J. & H. engineering staff 
as a rough approximation to the general trend in 
manufacturing industry insofar as one existed. 
This breakdown was used as a guide in sorting 
the actual cases into five hierarchy groups. 
While it was recognized that job titles are no 
guarantee of specific job content nor always a 
true reflector of the actual status of the job in 
the management, it was felt that such a break- 
down came as close as was possible to a general 
criterion of management status. In any event, 
the selection of the cases followed procedures 
more elaborate than merely reading a job title 
as will be shown in the next section. These job 
titles were the convenient way of expressing dis- 
tinctions in status that derived from the senior 
author’s consulting experience and cannot be de- 
fended any further than that. 

Selection and Placement of the Cases. In the 
selection of the cases, the major attempt was to 
obtain purity of hierarchical and occupational 


³ The authors are indebted to Vernon Keenan for 
making this factor analysis. 


classification with as wide a variation in occupa- 
tion and company affiliation as permitted by the 
case history file. In studying cases for place- 
ment, the man’s whole work history was re- 
viewed. The criteria for selection and place- 
ment in a hierarchy were: 

1. That the person be now, or last, employed 
at a job clearly recognized as belonging to a spe- 
cific grade in the hierarchy or that his employ- 
ment record indicate that he had consistently or 
steadily held such a job or jobs in the past. 

2. That his employment record indicate a con- 
sistency of occupation and that he had proceeded 
through a normal job succession up to the job 
according to which he was classified. 

3. That if the occasion for the testing was to 
apply for a position, the job applied for be at 
the hierarchy level indicated by his previous em- 
ployment. 

In reviewing the cases, the senior author fre- 
quently was able to bring a personal knowledge 
of company size and organization structure to 
bear upon the information provided in the job 
history. Also, since all of the cases had to have 
been tested since July of 1949, to have taken the 
improved E. Q. Form B personality test, the 
senior author had interviewed the majority of 
the cases himself regarding their job duties and 
histories. As a result of this knowledge of com- 
pany and job, a more consistent selection of 
typical cases for each grade was obtained than 
could have been obtained from job titles alone. 

No attention was paid to test results in select- 
ing cases, nor was any consideration given to 
whether the person was appraised as superior, 
average, or inferior. In fact, many of the cases 
had to have their personality tests rescored on 
the improved key⁴ after they had been chosen 
for the study. 

The attempt to secure 100 cases for each of 
the five categories resulted in some stretching of 
the criteria in categories I and II of officers and 
second level executives where for the last few 
cases some men were chosen where the previous 
employers were not well known and where the job 
applied for was lower than the category in which 
the man was placed by reason of previous em- 
ployment.⁵ 

Also it should be noted that there was no at- 
tempt to control company size or to modify the 
status of a job according to the size of the com- 
pany. Companies of all sizes are listed in the 
employment histories of the cases selected. How- 


⁴ The scoring key was revised following the item 
analysis. 

⁵ A detailed summary of the present or most re- 
cent employment of all cases chosen for each hier- 
archy category has been deposited with the Ameri- 
can Documentation Institute. Order Document No. 
4191 from the ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, Wash- 
ington 25, D. C., remitting in advance $1.25 for 35 
mm. microfilm or $1.25 for 6 X 8 in. photocopies. 
Make checks payable to Chief, Photoduplication 
Service, Library of Congress. 







Table 1 

E. Q. Test Trait Score Means and Sigmas 
According to Hierarchy Level 

                                         Hierarchy Level*
                      Total Group         I               II
E. Q. Trait            M     S.D.      M     S.D.      M     S.D.
Objectivity           4.4    1.9      4.3    1.9       ?     2.0
Social Dominance**    6.7    2.2      7.2    2.0       ?     2.4
Extraversion          3.9    1.8      3.7    2.0      3.9    1.7
Drive                 5.9    1.8      5.6    1.7      5.8    1.9
Detail**              4.3    1.9      ?.2    1.6      3.9    1.9
Emotionality**        4.0    2.3      3.0    1.9      ?.4    2.3
Adjustment (poor)**   3.6    2.1      2.8    1.8      3.2    1.8

* N = 100 for levels II, III, IV, and V. N = 57 for level I. 
** Significant at 5% level for single classification F test. 


ever, most of the companies in which the persons 
were presently employed or seeking employment 
were medium-sized companies, or medium-sized 
plants of large companies. Typical employment 
would be between 500 and 1,000 for the smaller 
plants in the group and between 2,000 to 3,000 
for the larger plants. 

Statistical Procedures. The statistical pro- 
cedures of the present study were based on the 
results of a pilot study of 200 cases chosen at 
random. The pilot study was a preliminary test 
of the hypothesis of a relationship between per- 
sonality trait scores and management hierarchy 
levels in order to determine whether or not the 
hypothesis had sufficient merit to warrant a full 
scale major study. 

The results of the pilot study indicated that 
trends for the E. Q. test traits by hierarchy level 
probably existed for the social dominance, de- 
tail, and emotionality traits and that there might 
well be continuous increment trends. An analy- 
sis of the differences between the means within 
the traits showing trends indicated that it would 
be necessary to have 100 cases at each of the 
compared hierarchy levels in order that mean 
differences of the magnitude observed be sta- 
tistically significant. Hence, the major study 
was done with 100 cases in each hierarchy level 
except level I, for which only 57 cases were cur- 
rently available. All subsequent results are from 
the major study. 
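
The sample-size reasoning in this paragraph can be illustrated with a rough calculation. The sketch below assumes a large-sample criterion for the difference between two independent means with equal group sizes and a common standard deviation. The pilot-study differences are not reported in the text, so the difference of 0.6 score points and the standard deviation of 2.0 are hypothetical values chosen only to be of the same order as the figures in Table 1.

```python
import math

def cases_per_group(mean_difference, sd, z_crit=1.96):
    """Smallest equal group size n at which an observed difference between two
    independent means reaches the two-sided 5% level, using the large-sample
    criterion |d| / (sd * sqrt(2 / n)) >= z_crit."""
    return math.ceil(2 * (z_crit * sd / mean_difference) ** 2)

# Hypothetical inputs: a mean difference of 0.6 points with a standard
# deviation of 2.0 (not values reported by the authors).
n_needed = cases_per_group(mean_difference=0.6, sd=2.0)
```

Under these assumed inputs the required group size comes out somewhat below 100, which is the order of magnitude the authors settled on for each hierarchy level.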


Results 


The Test for Trait Score Trend Validity. 
A single classification F test for hierarchy 
alone was used to determine the validity of 
the hierarchy trend for each trait and a t 
test was used to establish the validity of 
differences between trait score means at the 
extremes of the hierarchy for each trait. 


These tests demonstrated significant hier- 
archy trends for social dominance, detail, 
emotionality, and adjustment, but failed to 
reveal significant hierarchy trends for objec- 
tivity, extraversion, and drive. Table 1 
shows the trends in trait score means by hier- 
archy level for the major study. 
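
The two tests named above can be sketched as follows. This is a minimal illustration, not the authors' computation: the scores are hypothetical, only three levels are shown, and a pooled-variance t test is assumed because the text does not say which form of the t test was used.

```python
import math
from statistics import mean, variance

def one_way_f(groups):
    """Single classification analysis of variance.
    Returns (F ratio, degrees of freedom between, degrees of freedom within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

def pooled_t(a, b):
    """t statistic for two independent groups with a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical trait scores for three hierarchy levels, highest level first.
levels = [
    [8, 7, 9, 6],
    [7, 6, 8, 7],
    [6, 7, 5, 6],
]
f_ratio, df_between, df_within = one_way_f(levels)
t_extremes = pooled_t(levels[0], levels[-1])
```

The F ratio would be compared with the tabled 5% value for the stated degrees of freedom, and the t statistic compares only the two extreme levels of the hierarchy.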

The Test for the Independence of Hier- 
archy Trends. The establishment of hier- 
archy trends in several trait scores did not 
prove that hierarchy produced them inde- 
pendently of other variables. There could be 
other variables producing trait score trends 
which are associated with differences in hier- 
archy level. Age and education are certainly 
associated with differences in hierarchy level 
and might also produce trait score trends. 
Selective sampling by occupation might have 
occurred such that the hierarchical trends ob- 
served might have been due to trait score dif- 
ferences related to occupations rather than to 
hierarchy. Also, the present study and previ- 
ous studies (4) indicated that differences in 
objectivity trait scores were known to pro- 
duce differences in the four trait scores hav- 
ing hierarchy trends to as great an extent as 
hierarchy levels, particularly in the emotion- 
ality and adjustment traits. Fortunately, the 
objectivity score means were approximately 
the same for all five hierarchy levels in the 
present study. 

The first step in testing the independence 
of hierarchy as the trend producing variable 
was the analysis of the data to see if there 







Table 2 


E. Q. Test Trait Score Means and Sigmas 
According to Age Level 
Age Level 


Years 
20-30 
N = 87 


Years 
30-40 
N = 171 


M S.D. M 


E. Q. Trait S.D. 

Objectivity 4.7 2.0 3.5 21 

Social Dominance 6.8 2.4 2.4 

Extraversion 4.2 1.7 4.2 2.8 

Drive 6.2 1.6 1.9 

Detail 4.8 1.8 44 2.0 41 

Emotionality 4.7 2.4 4.1 2.5 3.5 : 
Adjustment. (poor) 3.8 2.2 3.8 2.6 3.4 J 3.7 


were observable trends in the alternative 
variables noted, i.e., age, education, and oc- 
cupation. These results are shown in Table 
2 for age, Table 3 for education, and Table 4 
for occupation. 

This analysis indicated that the hierarchy 
trend for adjustment trait scores might have 
been due to education alone and that the 
trends for detail and emotionality trait scores 
might have been due to both education and 
age. Occupational differences were too small 
to admit of much possibility for causing the 
observed hierarchy trends. 

The second step in testing the independence 
of hierarchy as the trend producing variable 
was a double classification F test analysis of 
variance pairing hierarchy with each of the 
variables: age, education, occupation, and 
objectivity.⁶ This would show, if the data 
were adequate, whether trait score trends for 
hierarchy were independent of trait score 
trends for these four variables, co-existent 
with them but still independent, or interact- 
ing with them to produce the over-all effect 
labeled “hierarchy trend.” 

Several difficulties occurred in the execu- 
tion of this procedure because of the unequal 
distribution of two of the four variables 
throughout the hierarchy. Only two gradua- 
tions of age could be used, 30-40 years and 
40 years and up, because no cases under 30 
years were in hierarchy levels I and II, and 
few in III. 

⁶ Because the subclasses contained widely differing 
numbers, the “disproportionate subclass method” for 
treating data classified in unequal numbers of items 
was used (4, 235-240). 


Also, only three hierarchy levels 
could be paired with occupation because oc- 
cupational specialization is infrequent in hier- 
archy I, which is composed of cases in gen- 
eral management at the officer level and be- 
cause at level V, the lowest in the hierarchy, 
the five occupational groups of production su- 
pervision, sales, accounting, design engineer- 
ing, and industrial engineering, give way to 
hourly rate jobs with skill, craft, clerical, or 
service classifications. 
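
The cell-coverage problem described above can be made concrete with a simple cross-tabulation. The sketch below uses hypothetical cases and category labels, and an assumed helper name, to show how categories of the paired variable that are absent from some hierarchy levels drop out of a double classification.

```python
from collections import Counter

def usable_categories(cases, levels, min_per_cell=1):
    """Return the categories of the paired variable that have at least
    min_per_cell cases at every hierarchy level."""
    counts = Counter(cases)  # (level, category) -> number of cases
    categories = {category for _, category in cases}
    return sorted(
        c for c in categories
        if all(counts[(level, c)] >= min_per_cell for level in levels)
    )

# Hypothetical (hierarchy level, age group) pairs: nobody under 30 at levels I and II.
cases = [
    ("I", "30-40"), ("I", "40+"),
    ("II", "30-40"), ("II", "40+"),
    ("III", "20-30"), ("III", "30-40"), ("III", "40+"),
    ("IV", "20-30"), ("IV", "30-40"), ("IV", "40+"),
    ("V", "20-30"), ("V", "30-40"), ("V", "40+"),
]
usable = usable_categories(cases, ["I", "II", "III", "IV", "V"])
```

Raising `min_per_cell` models the stricter situation in which a category is present at a level but with too few cases to be worth pairing.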

This shrinkage in the number of subclassifica- 
tions for one of the paired variables combined 
with the shrinkage in the number of cases in 
each cell because of the failure to use all of 
the data in the age and occupation pairings 
with hierarchy made it much more difficult to 
establish statistically significant independent 
trends with these two variables than with the 
other two variables, education and objectivity. 
As a result of these technical difficulties, the 
double classification analysis of variance tech- 
nique used in this study is thought to be 
valid only on the positive side. That is, 
where the double classification F test shows 
the independence of the hierarchy trend from 
one of the four paired variables for a given 
trait, that can be accepted as proof positive 
of independence. But where independence 
of hierarchy trend is not shown by this tech- 
nique, the matter is still open to subsequent 
proof or disproof using more cases and more 
numerous subclassifications of the paired 
variables. 
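
The logic of a double classification F test can be sketched for the simplest case. Note the swapped-in technique: the authors used the disproportionate subclass method for unequal cell sizes, whereas the sketch below covers only the balanced textbook case with equal numbers in every cell. All scores are hypothetical.

```python
from statistics import mean

def two_way_f(cells):
    """Balanced double classification analysis of variance.

    cells[i][j] is the list of scores for row level i and column level j;
    every cell must hold the same number of scores (at least two).
    Returns F ratios for rows, columns, and their interaction."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_rows = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_cols = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_inter = n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_within = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_within = ss_within / (a * b * (n - 1))
    return {
        "rows": (ss_rows / (a - 1)) / ms_within,
        "columns": (ss_cols / (b - 1)) / ms_within,
        "interaction": (ss_inter / ((a - 1) * (b - 1))) / ms_within,
    }

# Hypothetical scores: rows are two hierarchy levels, columns two education levels.
cells = [
    [[5, 7], [3, 5]],
    [[4, 6], [2, 4]],
]
f_ratios = two_way_f(cells)
```

A sizable F for rows alongside a negligible interaction F corresponds to what the text calls a hierarchy trend that is independent of the paired variable.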

The double classification F test pairing 







Table 3 

E. Q. Test Trait Score Means and Sigmas 
According to Education Level 

                                    Education Level*
                      High School    2 Years College    College Graduate
                        N = 179          N = 85             N = 193
E. Q. Trait            M     S.D.          M                  M
Objectivity           4.4    1.8          4.7                4.3
Social Dominance      6.5    2.3          6.4                7.2
Extraversion          3.8    1.7          3.7                4.2
Drive                 6.2    1.9          5.9                ?.4
Detail                4.8    1.8          4.4                3.9
Emotionality          4.6    2.3          3.7                3.6
Adjustment (poor)     4.4    2.2          3.4                3.0

* High School = High School graduation or less; 2 Years College = 1 or 2 years college; and College Graduate = 3, 4 or more years of college. 


hierarchy in turn with age, education, objec- 
tivity, and occupation failed to demonstrate 
any significant trend independence for drive 
and extraversion. Adjustment was shown to 
have significant trend independence at the 
5% level or better when hierarchy was paired 
with age, education, and objectivity; emotion- 
ality and social dominance had significant 
trend independence when paired with age 
and education; detail had significant trend 
independence when paired with education and 
objectivity. 

The complete absence of independent trends 
for the occupation variable when paired with 


hierarchy is probably due to the small vari- 
ance of trait scores among occupations. The 
complete absence of independent hierarchy 
trends when paired with occupation must be 
discounted as possibly due to a limited num- 
ber of hierarchy subclasses in this pairing, 
classes II-IV. Also, the absence of some ex- 
pected independent trends for the age, edu- 
cation and objectivity variables must be dis- 
counted as possibly due to the limited num- 
ber of subclassifications for these variables— 
2 for age and 3 for education and objectivity 
as compared with 5 for hierarchy. Keeping 
the above technical limitations in mind, cer- 


Table 4 


E. Q. Test Trait Score Means and Sigmas 
According to Occupation 


A 
N = 63 


E. Q. Trait M SD. M S.D 
Objectivity 4.1 
Social Dominance 6.3 
39 
6.0 
44 
40 
Adjustment (poor) 4.1 


48 19 
67 3223 
aa. Ee 
2.2 a on 
45 2.2 
2.3 41 24 
2.0 38 2.3 


Extraversion 
Drive 
Detail 
Emotionality 


Occupation * 


M 


38 2.1 
7.1 j 6.7 2.2 
4.4 39 16 


4.3 2.0 
67 2A4 
4.8 1.8 

5.3 1.8 6.3 1.7 
4.3 4.3 1.8 4.3 2.1 
4.1 3.6 2.3 3.9 2.5 
3.7 3.3 Ae 39 2.2 


* A = Production Supervisors; B = Design Engineers; C = Salesmen; D = Accountants; E = Industrial (Production) Engineers. 







tain statements may be made about the ob- 
served valid trends in E. Q. test trait scores 
previously presented in Table 1. 


Summary of Primary Results 


1. The trend for higher social dominance 
trait scores as the hierarchy ascends is: (a) 
independent of age; (b) independent of edu- 
cation; (c) not proven to be independent of 
objectivity; and (d) not proven to be inde- 
pendent of occupation. 

2. The trend for lower detail scores as the 
hierarchy ascends is: (a) not proven to be 
independent of age; (b) the result of interac- 
tion of hierarchy and education even though 
there is some other quantitative degree of the 
hierarchy trend which is independent of edu- 
cation; (c) independent of objectivity; and 
(d) not proven to be independent of occupa- 
tion. 

3. The trend for lower emotionality scores 
as the hierarchy ascends is: (a) independent 
of age; (b) independent of education; (c) 
not proven to be independent of objectivity; 
and (d) not proven to be independent of oc- 
cupation. 

4. The trend for lower, i.e., better adjust- 
ment scores as the hierarchy ascends is: (a) 
independent of age; (b) probably independ- 
ent of education (very close to 5% level of 
confidence) although there is a similar reduc- 
tion trend with increasing education that is 
independent of hierarchy; (c) a result of the 
interaction of the hierarchy and objectivity 
variables even though there is some other 
quantitative degree of the hierarchy trend 
which is independent of objectivity and some 
objectivity trend probably independent of 
hierarchy (very close to 5% level of confi- 
dence); and (d) not proven to be independ- 
ent of occupation. 

Secondary Results. A number of trends in 
trait scores were observed for the variables 
of age, education, objectivity, and occupation 
alone. Of these, only age was tested for va- 
lidity of trend by a single classification F test 
analysis of variance. This was done because 
the double classification F test pairing age 
with hierarchy was felt to be inadequate be- 
cause the full age range of the data could not 
be used. The single classification F test for 
age alone showed a valid trend, at better than 


the 5% level of confidence, for detail trait 
score means of successively older age groups 
to decline. The previously observed trends 
for extraversion (sociability), emotionality 
and drive trait scores to decline with increas- 
ing age (Table 2) were not marked enough 
to prove themselves valid at the 5% level of 
confidence. 

Increasing amounts of formal education 
gave trait score trends of lower (poor) ad- 
justment, emotionality, detail, and drive trait 
score means (Table 3). These trends were 
not tested for validity by the single classifi- 
cation F test because the double classification 
F test pairing education and hierarchy was 
thought to be adequate. As indicated previ- 
ously, only the trend for adjustment scores to 
decline with education proved valid at the 
5% level of confidence. 

Occupational differences in trait score means 
occurred but could not be called trends (Ta- 
ble 4). The differences were not great enough 
and the number of cases in each occupational 
group was too small to establish their validity 
by statistical procedures. Salesmen and in- 
dustrial engineers were highest in traits of 
extraversion (sociability) and drive. Design 
engineers were lowest in extraversion and 
highest in objectivity. Accountants were low- 
est in objectivity and (poor) adjustment, the 
latter probably because of the former. Pro- 
duction supervisors were highest in adjust- 
ment. There were no marked differences 
among the five occupational groups segre- 
gated in the trait scores of detail or emo- 
tionality. 

Decreasing objectivity trait score groups 
gave trait score trends of higher extraversion 
and lower detail, emotionality, and (poor) 
adjustment trait scores. Only the (poor) ad- 
justment trait score trend for objectivity dif- 
ferences proved valid in the double classifica- 
tion F test pairing objectivity and hierarchy. 
The use of a larger number of objectivity 
subgroupings and more cases would be re- 
quired to clarify whether the traits of domi- 
nance, detail and emotionality also have valid 
trends with objectivity. 


Discussion 


Inasmuch as two of the failures of traits to 
establish the independence of their hierarchy 







trends occurred when hierarchy was paired 
with objectivity, it is of utmost importance 
to recognize that objectivity trait score means 
were practically constant for all five grades 
of the hierarchy. Hence, while it can be 
concluded that hierarchy trends will not oc- 
cur in emotionality or dominance trait scores 
when dealing with only high or low objec- 
tivity score groups, it can be said that these 
effects will cancel each other out in a ran- 
domly selected sample that gives a normal 
distribution of objectivity scores. Therefore, 
in the present study, where a normal dis- 
tribution of objectivity scores occurred at all 
hierarchy levels, emotionality and dominance 
trait score trends appeared which were not 
due to objectivity differences among the hier- 
archy grades. 

The hierarchy trend for detail trait scores 
cannot be separated from the age variable 
at the present time because of the lack of 
younger people in the upper hierarchy levels. 
However, age is also closely related to ad- 
ministrative and managerial experience in the 
present sample and such experience could be 
the true variable associated with the detail 
trait hierarchy trend. A control study, keep- 
ing age constant at 35 to 45 years with the 
N at each hierarchy level ranging from 27 to 
67 cases gave a consistent hierarchy trend for 
detail of about the same magnitude as in the 
major study (Table 1) in which age was un- 
controlled. Hence the total evidence favors 
the independence of the hierarchy trend for 
the detail trait from the age variable, if not 
from managerial experience. 

It is also apparent that occupational influ- 
ences on trait scores overlie hierarchy influ- 
ences and in a few cases may exceed them. 
For example, junior salesmen had higher
dominance scores than sales managers in our 
small sample of the sales occupation. Fur- 
thermore, the fact that hierarchy differences 
in trait scores were greater than occupational 
differences points up the fact that vocational 
guidance for adults, relative to tested per- 
sonality traits, has a hierarchy level dimen- 
sion which may be more discernible than the 
occupational dimension. 

Extremely interesting to the authors is the 
fact that two of the four traits showing valid 
hierarchy trends, i.e., detail and emotionality, 


showed the least differences among occupa- 
tional groups of all seven traits. Only in po- 
sitions deemed administrative, such as gen- 
eral managers, works managers, and indus- 
trial relations directors did a sharp reduction 
in emotionality and detail trait score means 
become apparent. This suggests the hypothe- 
sis that hierarchical differences are primarily 
differences in the breadth and generality of 
administrative responsibilities. The hierarchy, 
according to this hypothesis, proceeds from 
specific occupational activities to the adminis- 
tration of specific occupational activities, to 
the administration of more diverse and more 
generalized occupational activities; and rising 
in the hierarchy is favored by personality 
traits suitable in degree and kind to such ad- 
ministrative responsibilities. 

As a precaution against overgeneralizing the 
results of this study, it should be remembered 
that at every hierarchy level there was a 
nearly normal distribution of scores for every 
trait and that the observed trait trends were 
only small changes in the central tendencies 
of these distributions. That is why it was so 
difficult to establish the validity of these 
trends. It took five hierarchy levels with 
approximately 100 cases in each to do it. 
The wide dispersion of scores could be due to 
the fact that there are many other probable 
determiners of hierarchy-climb survival or 
achievement than personality trait scores. 
Intelligence, experience, education, political 
skill, competition, motivation, values, etc., 
are other variables that come to mind. Con- 
sidering these many probably contributing 
variables, it is remarkable that a brief pencil 
and paper personality test could show con- 
sistent and valid hierarchy trends in four of 
its seven trait scores. It should also be noted 
that of the two traits with the highest inter- 
correlation, social dominance and extraver- 
sion (sociability) r = .78, only one, social 
dominance, showed a hierarchy trend. This 
throws doubt on the validity of Ellis’ (2) 
implied criticism that personality inventories 
with overlapping traits are undesirable. 

Also, it should be noted with caution that 
the present study does not offer any direct 
evidence as to whether possession of these 
“trend” trait scores on the “high” side of the 
distribution at a single level of the hierarchy 







indicates that their possessors are more com- 
petent in their jobs than people on the “low” 
side within that same hierarchy level. Rather, 
the evidence is all indirect and follows a “sur- 
vival” concept based on the belief that there 
is a progressively more stringent selection for 
fewer and fewer jobs as the hierarchy ascends. 
Since the variance in four trait scores has 
been demonstrated to have hierarchy “sur- 
vival” value by this study, it can be con- 
cluded that this study has been, to some de- 
gree, successful in indirectly validating S. J. 
& H.’s E. Q. Personality test for use as one 
tool among several in their appraisal of candi- 
dates and incumbents for management posi- 
tions. Also, in a limited way, it has con- 
tributed in the broader task of isolating the 
characteristics of industrial managers. It re- 
mains for a future study to determine whether 
there are additional hierarchy trend traits and 
whether all trend traits “develop” in their 
possessors with job experience or exist full 
blown from early adulthood. 


Summary 


The traits of (poor) adjustment, emotion- 
ality, detail and social dominance as measured 


by Form B of the Employee Questionnaire, a 
brief industrial personality test developed by 
Stevenson, Jordan & Harrison psychologists 
(1), were found to have valid management 


hierarchy trends. The traits of extraversion 
(sociability), drive, and objectivity did not 
have valid hierarchy trends. There was no 
rating criterion of success or failure for the 
cases studied. Rather, current achieved po- 
sition in the hierarchy was the implied cri- 
terion since the cases studied held jobs at a 
particular level in the hierarchy. 

The criterion of validity for trend was a 
single classification analysis of variance of 
trait scores for the five hierarchy categories 
giving an F ratio at the 5% level of confi-
dence or better. Also, validity was demon-
strated by a t test of the significance of dif-
ferences between trait score means of the top
and bottom hierarchy categories giving a t
ratio at the 5% level of confidence or better.
The valid trends were such that detail, emo- 
tionality and (poor) adjustment trait score 
means decreased at each successively higher 
level of the hierarchy while social dominance 


trait score means increased as the hierarchy 
levels became successively higher. 
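
The single classification analysis of variance described above can be sketched as follows. This is a minimal illustration only: the function name `one_way_f` and the three small score groups are hypothetical and are not the study's data (the study used five hierarchy levels). The resulting F ratio would be compared with the tabled 5% value for the stated degrees of freedom.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a single-classification ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical trait scores for three hierarchy levels (illustration only).
levels = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, df_b, df_w = one_way_f(levels)
print(f"F = {f_ratio:.2f} with {df_b} and {df_w} degrees of freedom")
```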

The industrial management hierarchy was 
divided into five levels with company officers 
and general managers at the top level and 
hourly rate employees at the bottom level. 
One hundred cases at each of the five levels 
except the top level, which had 57, were uti- 
lized in the study. 

The valid hierarchy trends for the four 
traits mentioned were found by a double 
classification analysis of variance technique 
to exist independently of the variables of age, 
education, and objectivity at the 5% level of
confidence or better with the following excep- 
tions. The detail trait trend was not proven 
to be independent of age. The social domi- 
nance and emotionality trait trends were not 
found to be independent of objectivity. The 
latter exception was not held to be a major 
defect in demonstrating hierarchy trends be- 
cause objectivity trait score means were prac- 
tically equal at all five hierarchy levels. 

Miscellaneous trait trends were observed 
for the variables of age, education, objec- 
tivity, and occupation alone. But these were 
not a major aspect of study and were not 
tested statistically. Hierarchy differences in 
personality trait scores were generally greater 
than occupational differences. 

For all traits there was a substantial and 
normally distributed dispersion of scores 
around the mean at every level of the hier- 
archy indicating the probable participation of 
many variables in addition to personality test 
scores in determining the hierarchy levels of 
the cases studied. 


Received May 27, 1953. 


References 


1. Carr, E. R., and Rothe, H. F. Validity of an objectivity key on a short industrial personality questionnaire. J. appl. Psychol., 1950, 34, 178-181.

2. Ellis, A. Recent research with personality inventories. J. consult. Psychol., 1953, 17, 45-49.

3. Mitchell, M. B., and Rothe, H. F. Validity of an emotional key on a short industrial personality questionnaire. J. appl. Psychol., 1950, 34, 329-332.

4. Rothe, H. F. Use of an objectivity key on a short industrial personality questionnaire. J. appl. Psychol., 1950, 34, 98-101.

5. Snedecor, G. W. Statistical methods (4th Ed.). Ames, Iowa: Iowa State College Press, 1946.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954


Temperament Measures in Industrial Selection ¹


Frederick Herzberg 


Psychological Service of Pittsburgh and University of Pittsburgh 


Psychologists have reached an advanced 
stage in the development of test measures 
which can be applied to the problems of in- 
dustrial selection. The least dependable of 
these measures lies today in the area of per- 
sonality and temperament assessment. One 
of the major reasons for the wariness with 
which industrial psychologists approach tem- 
perament inventories is the transparency of 
such tests and their corresponding amena- 
bility to faking or pointing answers to achieve 
a desired result. Many studies have demon- 
strated the possibility that this faking can 
occur (1, 2, 3, 4, 5). These studies, however,
have generally been based on artificial situa- 
tions in which college students have been in- 
structed to attempt such faking. 

One may ask, as Guilford does, in his 
manual for the Guilford-Zimmerman Tem- 
perament Survey, whether such faking or 
pointing occurs in the actual situation. And 
if faking does occur, to what extent does it 
negate or limit the use of the test for indus- 
trial selection purposes? 

Two hypotheses were examined in this 
study relating to the existence in the employ- 
ment situation of such manipulation of test 
responses. The first hypothesis is that the 
distribution of the Guilford-Zimmerman Tem-
perament scores will be significantly higher
for persons tested in the industrial situation
than the distribution of scores for either col-
lege students or clients seen for vocational
guidance.

It is suggested here that these three groups 
have three different motivations for taking 
this test. The motivation for the industrial 
group is to get a job or to get promoted. The 
motivation of the vocational counseling sub- 
jects ostensibly is to gain information about 
their abilities and job opportunities, while the 
college students’ reason for taking the test is 


¹ This research was supported by a grant from the
Buhl Foundation. 




basically an academic one of pleasing the in- 
structor or participating in an experiment. 
The second hypothesis is that in industrial 
testing, where faking is expected to occur, the 
educational level of the examinees will affect 
the extent of such faking; i.e., the higher the 
education the higher will be the score dis- 
tributions. This is suggested by the fact that 
the higher educational groups have more gen- 
eral intelligence to understand the implica- 
tions of the items and more test sophistica- 
tion from their longer academic experience. 
The Guilford-Zimmerman test was chosen 
for this study because it is one of the most 
widely used personality inventories of the 
non-psychiatric type. Its avoidance of psy-
chiatric terminology and goals makes it more
applicable in industrial personnel work.


Method 


Population. The industrial group (those 
tested for employment, promotion, or com- 
pany personnel survey purposes) consists of 
a total of 924 cases, of which 338 are college 
graduates, 128 have had 1-3 years of college
education, 353 are high school graduates, and 
105 have only elementary school education. 

The self-referral group (vocational guid- 
ance clients) contains 94 college graduates 
and 56 high school graduates. 

The college group (University of Pitts- 
burgh students in Introductory Psychology 
classes) consists of a total of 109 students 
approximately equally distributed among the 
four years of freshman to senior. 

All subjects are males. 

Analysis. Frequency distributions and basic 
distribution statistics were computed for each 
of the “motivation” and educational groups. 
There was considerable skewness in many of 
the distributions with corresponding unequal 
variability between the comparison groups. 
All distributions were unimodal and plots 
showed that the differences between groups 
lay in higher scores for one distribution as 







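Table 1

Summary of Significant Differences Between Means of Guilford-Zimmerman Scales for Groups Studied

Comparisons (column headings):
  Industrial College Graduates vs. Industrial High School Graduates
  Self-Referral College Graduates vs. Self-Referral High School Graduates
  Pitt Freshmen and Sophomores vs. Pitt Juniors and Seniors
  Total Pitt Sample vs. Industrial Non-College Graduates
  Industrial College Graduates vs. Self-Referral College Graduates

Scales (rows): General Activity, Restraint, Ascendance, Sociability, Emotional Stability, Objectivity, Friendliness, Thoughtfulness, Personal Relations, Masculinity

[Cell entries illegible in source.]

* Difference between means significant at .05 level of confidence.
** Difference between means significant at .01 level of confidence.
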
opposed to the other. The differences be-
tween the means of the various groups were
tested for significance by Student’s “t” ratio.

The .01 level of confidence was accepted. 
There were significant age differences be- 
tween some of the groups but this was proved 
not to be a pertinent variable in this study. 
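
As an illustration of the comparison of group means described above, the following is a minimal sketch of a pooled-variance Student's t ratio. The function name `student_t` and the two small samples are hypothetical. The pooled form assumes equal group variances, which the skewness and unequal variability noted in the text may strain, so the sketch shows the arithmetic only and does not reproduce the author's computations.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Return (t, df) for two independent samples using a pooled variance."""
    n1, n2 = len(a), len(b)
    # Pooled estimate of the common variance (assumes equal variances).
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, n1 + n2 - 2

# Hypothetical scale scores for two groups (illustration only).
industrial = [4, 5, 6]
college = [1, 2, 3]
t_ratio, df = student_t(industrial, college)
print(f"t = {t_ratio:.3f}, df = {df}")
```

The t ratio would then be referred to a t table at the .01 level for the stated degrees of freedom.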


Results 


Table 1 presents a summary of the signifi- 
cant differences found between the means of 
the groups studied on the ten Guilford-Zim-
merman scales.

A comparison on the basis of education 
within the industrial group shows higher 
mean scores for each scale on the Guilford- 
Zimmerman with increasing education from 
grammar school through high school to col- 
lege graduation. Scales for which the differ- 
ences between high school and college edu- 
cation are significant at the .01 level of 
confidence are Ascendance, Sociability, Emo- 
tional Stability, Objectivity, Personal Rela- 
tions, and Masculinity. All these differences 
with the exception of Masculinity are signifi- 
cant at beyond the .001 probability level. 

Two differences at the .01 level (Objec-
tivity and Personal Relations) occur between
college graduate self-referrals and the high 




school graduate self-referral sample. Only 
for the Masculinity scale is there a signifi- 
cant difference between University of Pitts- 
burgh freshmen and sophomores and Pitts- 
burgh juniors and seniors. All three differ- 
ences are again in the direction of larger 
means for the higher education groups. 

In order to compare the industrial popula- 
tion with an academic motivation group the 
norms provided by Guilford could have been 
utilized. However, since our industrial sub- 
jects are from Western Pennsylvania and the 
manual norms are based upon California col- 
lege students, it was decided to gather norms 
on an equivalent Pittsburgh college popula- 
tion. University of Pittsburgh norms are 
found to be essentially similar to those re- 
ported by Guilford with the exception of a 
higher Masculinity score for Pitt students. 
The Pittsburgh college group was then com- 
pared with non-degree college education in- 
dustrial cases. This latter group was chosen 
for comparison with the Pitt students in order 
to equate for the education level which was 
found to be of significance for the industrial 
subjects. The norm values for the non-de- 
gree college industrial sample are found to be 
approximately midway between the norms for 
high school graduates and college graduates. 







Significant differences at the .01 or better 
level of confidence differentiate these two 
groups in favor of the industrial population 
on the Restraint, Sociability, Emotional Sta- 
bility, Objectivity, Friendliness, and Personal 
Relations scales. A higher Ascendance mean 
was significant at the .05 point. 

Comparing the college graduate industrial
group with college graduate counseling clients,
we find the means of General Activity, As- 
cendance, Sociability, Emotional Stability, 
Objectivity, and Personal Relations scales all 
to be significantly different. These differ- 
ences are once more in the predicted direction 
of higher scores for the industrial cases. 


Discussion 


These results support both of the hypothe-
ses stated in the introduction. The industrial
population at an equivalent educational level
has higher means on most of the scales of
the G-Z than do corresponding academic and
counseling client samples. In addition, the
educational differences occurred primarily 
with the industrial samples. The hypothesis 
that faking or pointing of personality tests 
does actually occur in the industrial situation 
is well substantiated by these data; first, by 
their higher scores, and second, by the rein- 
forcing of this finding with the educational 
differences obtained. It seems reasonable 
therefore to conclude that clients for employ- 
ment or promotion do fake their test re- 


sponses and this occurs to a greater extent at 
the higher educational levels. 

As to the question raised in the introduc- 
tion regarding the significance of such point- 
ing on the usefulness of the test, one need 
only examine norms based upon a college 
graduate industrial sample. For the Socia- 
bility scale, the median will fall at a score of 
25 on a 30 item scale, i.e., one-half of the
group will achieve scores of five-sixths or 
more of the possible number of items in- 
cluded in that area. Similar results are found 
for the Emotional Stability, Objectivity, and 
Personal Relations scales. 

Perhaps the nature of the distribution of 
G-Z scores which are obtained in employ- 
ment testing is best illustrated by presenting 
the percentile ranks of scores for the groups 
studied which are equivalent to the medians 
on the manual norms. These percentile ranks 
appear in Table 2. 

The median scores, for example, on the 
published norms for the Emotional Stability 
and Personal Relations scales fall at the fif- 
teenth percentile when computed from a sam- 
ple of industrial college graduates. The other 
scales show similar discrepancies in the me- 
dian values. The equivalence of the Pitt 
sample to Guilford’s California college popu- 
lation is shown in the last column of Table 2. 
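
The kind of figure reported in Table 2 can be sketched as follows: locate the manual-norm median within a group's own score distribution and express it as a percentile rank to the nearest fifth percentile. The function names, the midpoint convention for tied scores, and the twenty scores below are all assumptions made for illustration; the article does not state how its percentile ranks were computed.

```python
from math import floor

def percentile_rank(scores, value):
    """Percent of scores below `value`, counting ties at half weight."""
    below = sum(1 for s in scores if s < value)
    equal = sum(1 for s in scores if s == value)
    return 100.0 * (below + 0.5 * equal) / len(scores)

def nearest_fifth(percentile):
    """Round a percentile rank to the nearest multiple of five."""
    return int(5 * floor(percentile / 5 + 0.5))

# Hypothetical scale scores for an industrial group (illustration only).
group_scores = [18, 20, 21, 22, 22, 23, 24, 24, 25, 25,
                25, 26, 26, 27, 27, 28, 28, 29, 29, 30]
manual_median = 22  # hypothetical median taken from the published norms

rank = percentile_rank(group_scores, manual_median)
print(nearest_fifth(rank))
```

A low value indicates that a score which is average on the published norms falls near the bottom of the group's own distribution.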

With such extreme “piling-up” it is diffi- 
cult to conceive of the meaning of a high 
score on these scales, much less to utilize 


Table 2

Percentile Ranks* of Scores for the Groups Studied Which Are Equivalent to the Medians on the Manual Norms

Scale   Industrial College Graduates   Industrial High School Graduates
G       40                             40
R       ?0                             30
A       25                             40
S       20                             30
E       15                             25
O       20                             45
F       35                             35
T       40                             40
P       15                             30
M       40                             50

Remaining columns (Self-Referral College Graduates, Self-Referral High School Graduates, Pitt Students); surviving values by row as extracted:

G       65   55   50
R       40   50   45
A       50   60   40
S       40   45
E       40   60
O       40   60
F       45   50
T       40
P       30
M       55

[Rows S through M are incomplete in the source.]

* To the nearest fifth percentile







them. When one considers the curvilinear 
use of these tests, as recommended by Guil- 
ford, he would have to reject half the appli- 
cants on those scales where a high score is 
considered a drawback. 

These extreme results apply mostly to the 
use of the test with a college graduate indus- 
trial population. But this is just the popu- 
lation wherein the need for such a tempera- 
ment evaluation is greatest. The top level 
jobs involving supervision and personal rela- 
tions usually are held by college graduates, 
increasing the need of some assessment of 
their personality characteristics. 

Received May 14, 1953. 


References


1. Cofer, C. N., Chance, June, and Judson, A. J. A study of malingering on the MMPI. J. Psychol., 1949, 27, 491-499.

2. Hunt, H. F. The effect of deliberate deception on Minnesota Multiphasic Personality Inventory performance. J. consult. Psychol., 1948, 12, 396-402.

3. Kimber, J. A. M. The insight of college students into the items on a personality test. Educ. psychol. Measmt., 1947, 7, 411-420.

4. Longstaff, H. P., and Jurgensen, C. E. Fakability of the Jurgensen Classification Inventory. J. appl. Psychol., 1953, 37, 86-89.

5. Wesman, A. G. Faking personality test scores in a simulated employment situation. J. appl. Psychol., 1952, 36, 112-113.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


A Validation Study of the Worthington Personal History Blank 


John G. Clark and W. A. Owens 


Iowa State College, Ames, Iowa


The Worthington Personal History Blank 
(hereafter, PH) consists of an unstructured
4-page application blank which is used as a 
projective technique in industrial selection. 
Some evidence for the validity of PH has ap- 
peared in the form of “testimonials” from 
satisfied users. Most of these have been fa- 
vorable. Somewhat more empirical evidence 
has come from Worthington in his doctoral 
dissertation (5) and in a recent article (3). 
The former offers evidence of a favorable 
comparison (approximately 87% agreement)
between PH analyses and psychiatric diag- 
noses for ten V.A. Mental Hygiene Clinic pa- 
tients. The latter article indicates that PH 
was useful in predicting effectiveness of sales- 
men for a light manufacturing company, as 
indicated by biserial correlation coefficients 
of .34 with tenure and .31 with sales volume. 
Swint and Newton (4) reported that in the 
prediction of supervisory potentiality, the PH 
was accurate in 85% of the cases.

Since a single PH analysis costs in the 
neighborhood of $40.00, its use would be 
justified only if its efficiency were consider- 
ably greater than that of conventional, less 
expensive instruments. It is, therefore, the 
purpose of this study to compare PH and ob-
jective tests with respect to their relative effi-
ciency in predicting associates’ ratings in an
industrial situation. 


Method 


The subjects of the present study were 47 
employees of an Iowa publishing company. 
They were originally selected by the employ- 
ment manager as having rather distinctive 
and unusual personalities lending themselves 
to easy PH diagnosis and to simple inspec- 
tional checks on the accuracy of the proto- 
cols. In order to make possible a more 
exacting test of the potential value of the 
instrument, the problem was subsequently 
presented to the Department of Psychology 
at Iowa State College as a possible thesis 


project. The department approved it as 
such, and it was decided that the most con- 
venient procedure for evaluation would be to 
compare the validity of the PH, against a 
criterion of associates’ ratings, with the va- 
lidity of certain standardized tests, against 
the same criterion. 

In addition to PH analyses for each sub- 
ject, percentile ranks were available on speed 
and power measures of intelligence (The 
Wonderlic Personnel Test and The Person- 
nel Laboratory’s Employment Test), on the 
Thurstone Temperament Schedule, and on 
the traits “Dominance” and “‘Self-Sufficiency” 
from the Bernreuter Personality Inventory. 

A five step criterion rating scale was con- 
structed, the traits included being selected on 
the bases of ease of rating and commonality 
with both PH and test results, particularly 
the former. 

PH reports were transformed into quanti- 
tative terms by five experienced psycholo- 
gists. These judges decided, on the basis of 
PH reports, whether a given subject should 
be classified as “high” (+) or “low” (—) 
with respect to each of the traits under con- 
sideration, and the score assigned reflected 
the degree of their agreement. Literally, a 
six point scale was provided, ranging from 
five +’s to no +’s. It was unnecessary to 
perform this operation on the PH estimates 
of intelligence, since these were reported in 
terms of estimated Wechsler-Bellevue intelli- 
gence quotients. 
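
The quantification step described above can be sketched as follows: each of five judges marks a subject "+" or "-" on a trait, and the trait score is the number of "+" judgments, giving the six point scale from 0 to 5. The function names are hypothetical, and the majority count is added only to show how the agreement figures of Table 1 could be tallied; the authors' actual procedure is not given in more detail than the paragraph above.

```python
def ph_trait_score(judgments):
    """Number of '+' judgments among five judges (0 through 5)."""
    if len(judgments) != 5 or any(j not in {"+", "-"} for j in judgments):
        raise ValueError("expected exactly five judgments, each '+' or '-'")
    return judgments.count("+")

def majority_size(judgments):
    """How many of the five judges sided with the majority (3, 4, or 5)."""
    plus = ph_trait_score(judgments)
    return max(plus, 5 - plus)

# Hypothetical judgments for one subject on one trait (illustration only).
judgments = ["+", "+", "-", "+", "+"]
score = ph_trait_score(judgments)
agreement = majority_size(judgments)
print(score, agreement)
```

A majority size of 5 corresponds to complete agreement among the judges, and a value of 4 to agreement of four of the five.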

The criterion ratings were made by two 
raters per subject. Since the subjects were 
scattered throughout the company concerned, 
it was impossible to obtain more ratings per 
individual or to have each rater rate all of 
the subjects; and the ratings were, therefore, 
simply pooled. 

Since the test scores were reported only in
centile ranks and could not be assumed to be
normally distributed, the correlation among
the variables was estimated in terms of con-
tingency coefficients.¹ The following com-
parisons were made: Comparison I: PH vs.
ratings; Comparison II: Test results vs. rat-
ings; Comparison III: PH vs. test results;
and Comparison IV: PH vs. ratings as com-
pared with test results vs. ratings.


Table 1

Agreement among Judges on Quantification of PH Reports

Traits                     Degree of        Degree of
                           Agreement (a)    Agreement (b)
Activity                   68%              87%
Vigorous                   45%              74%
Impulsive                  45%              74%
Dominant                   42%              91%
Stable                     38%              59%
Sociable                   42%              84%
Reflective                 19%              57%
Self-sufficiency           38%              78%
Job-effectiveness          70%              89%
Promotion possibilities    42%              72%
Adjustment to others       36%              74%

(a) Per cent of cases in which all five judges agreed.
(b) Per cent of cases in which four of the five judges agreed.


The significance of the difference between 
the two series of contingency coefficients in 
Comparison IV was tested through the use of 
a randomization test (1). 
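
For readers unfamiliar with the statistic, the following sketch computes a contingency coefficient from a two-way frequency table as C = sqrt(chi-square / (chi-square + N)). It assumes Pearson's form of the coefficient, which the article does not name explicitly; the function name and the small tables are hypothetical. Every row and column must contain at least one case, since an empty row or column would give an expected frequency of zero.

```python
from math import sqrt

def contingency_coefficient(table):
    """Pearson's contingency coefficient C for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    return sqrt(chi_square / (chi_square + n))

# Hypothetical 2 x 2 tables (illustration only).
perfect = contingency_coefficient([[10, 0], [0, 10]])
independent = contingency_coefficient([[5, 5], [5, 5]])
print(round(perfect, 3), round(independent, 3))
```

The upper limit of C depends on the size of the table (for a k by k table it is sqrt((k - 1) / k)), so coefficients from tables of different dimensions are not directly comparable.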


Results 


The degree of agreement among the judges 
who quantified the PH reports is shown in 
Table 1. 

Table 2 shows the retest reliabilities of the 
ratings. 

Tables 3, 4, 5, and 6 show the results of 
the four comparisons. The only significant 
relationship found in the first three compari- 
sons was that between PH and the Thurstone 
Temperament Schedule on the trait “Soci- 
able.” This coefficient was significant at the 
.01 level of confidence.

Comparison IV, the crucial one (Table 6) 
shows consistently higher coefficients of rela- 
tionship between objective test results and 
ratings than between PH and ratings. The 
randomization test indicates a probability of 


¹ Test scores were grouped by deciles.


Table 2

Retest Reliabilities of Ratings

Trait                      r
Activity                   .68
Decisiveness               .46
Dominance                  .63
Personal Adjustment        .90
Sociability                .62
Job-effectiveness          .79
Promotion possibilities    .93
Adjustment to others       .91

Note: 5% level, r = .29; 1% level, r = .37.


.06 that five such differences in the same di- 
rection could occur by chance. 
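
The probability of .06 can be illustrated with a sign-flip randomization test on the five differences reported in Table 6 (.148, .094, .086, .057, .161). The article cites Moses (1) for the method but does not describe the variant used, so the two-sided sign-flip form below is an assumption, and the function name is hypothetical. Under the hypothesis of no difference, each difference is equally likely to be positive or negative, giving 2^5 = 32 equally likely sign patterns.

```python
from itertools import product

def sign_flip_p(differences):
    """Two-sided randomization p-value from all sign patterns of the differences."""
    observed = abs(sum(differences))
    patterns = 0
    as_extreme = 0
    for signs in product((1, -1), repeat=len(differences)):
        patterns += 1
        total = abs(sum(s * abs(d) for s, d in zip(signs, differences)))
        # Small tolerance guards against floating-point rounding.
        if total >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / patterns

# Differences between test-vs.-rating and PH-vs.-rating coefficients (Table 6).
differences = [0.148, 0.094, 0.086, 0.057, 0.161]
p_value = sign_flip_p(differences)
print(p_value)
```

Only the two patterns in which all five signs agree are as extreme as the observed result, so the probability is 2/32, or about .06.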


Discussion 


The apparent, consistent superiority of the 
objective tests, as indicated in this investiga- 
tion, constitutes damaging evidence as to the 
usefulness of PH. These results, while not in 
accord with those of the previously mentioned 
studies of the PH, do tend to follow the pat- 
tern of dubious or negative results found in 
validational studies of other projective tech- 
niques (2). 

Although not of major importance to the 
present study, the lack of significant relation- 
ship to the criterion on the part of both struc- 
tured and unstructured techniques is of in- 
terest. There are certain methodological 
problems involved which may, in part, ac- 
count for this lack of relationship. These 
are centered around: (a) the criterion rat- 
ings; (b) the quantification of PH; (c) the 
selection of tests; and (d) the nature and 
number of subjects.


Table 3

Contingency Coefficients (C) PH vs. Ratings

Traits                     C
Impulsive
Dominant
Stable
Sociable
Job-effectiveness
Promotion Possibilities
Adjustment to Others

[Coefficient values illegible in source.]







Table 4

Contingency Coefficients (C) Test Results vs. Ratings

Traits        C       P
Active        .753    > .05
Impulsive     .749    > .05
Dominant      .740    > .05
Stable        .733    > .05
Sociable      .746    > .05

The main problem with the criterion has 
already been mentioned. Regardless of the 
absence of certain refinements, however, there 
is no reason to believe that the ratings were 
any poorer criteria for PH than for the tests. 
The method of quantification of PH may 
be a source of error. However, the selection 
of the criterion traits was made on a basis 
which should not only be fair to PH but rea- 
sonably important to the selection process. 
The general agreement among judges indi- 
cates that the quantification of PH reports in 
terms of the selection of traits was adequately 
accomplished. Surely it would seem reason- 
able to assume that if a group of experienced 
psychologists encountered difficulty in mak- 
ing a useful interpretation of PH reports, per- 
sonnel workers would also have considerable 
difficulty in making a valid decision on the 
basis of the information contained in them. 
Specifically, if the method of quantification 
were to be suspected as a cause of error, one 
would not expect to find this factor operat- 


Table 5

Contingency Coefficients (C) PH vs. Test Results

Traits              C              P
Intelligence
  Speed             .688
  Power             7157 [sic]
Active              085 [sic]
Vigorous            046 [sic]
Impulsive           .709
Dominant ¹          .673
Stable              .694           > .05
Sociable            .787           .01
Reflective          .698           > .05
Self-sufficiency    .702           > .05

¹ From the Thurstone Temperament Schedule.


Table 6

Contingency Coefficients (C) PH vs. Ratings Compared
with Test Results vs. Ratings

Traits        Test Results    Differences
Active        .753            .148
Impulsive     .749            .094
Dominant      .740            .086
Stable        .733            .057
Sociable      .746            .161


ing in the case of the comparison between PH 
estimates of intelligence and the intelligence 
test results, since the PH estimates were given 
in terms of Wechsler-Bellevue intelligence 
quotients. Table 5 indicates no tendency 
for the relationship between PH and the test 
results to be higher in the case of intelligence 
than in that of other traits. In fact, the only 
significant relationship obtained appears on a 
trait which was quantified by the judges (so-
ciability).

To continue, only test scores already avail- 
able could be utilized, and certain traits meas- 
ured were not well adapted to use in a rating 
scale. The effect, however, would only be to 
minimize the apparent effectiveness of the ob- 
jective measures and to make the PH look 
relatively better. 

Finally, the subjects were neither a large 
nor a random sample from the industrial 
population. Since they were originally se- 
lected because they possessed some outstand- 
ing traits or characteristics (an advantage 
for the PH analyst if he were aware of it), 
they are by definition a very heterogeneous 
group. This great variability among them 
has undoubtedly operated to inflate all the 
estimates of association herein reported. It 
thus seems that with comparable subjects 
and a larger N many of the criterion relation-
ships of Tables 3 and 4 would have been
significant. It is difficult, however, to im- 
agine that the apparent merit of PH vs. ob- 
jective tests was largely influenced by the 
present choice of subjects. 


Summary 


An investigation was conducted to com- 
pare the Worthington Personal History Blank 










(PH) and objective test results with respect 
to their relative efficiency in predicting asso-
ciates’ ratings of 47 employees of an Iowa
publishing company. By and large, neither 
PH nor the objective test results were signifi- 
cantly related to the criterion ratings; how- 
ever, the coefficients of contingency for the 
comparison of objective tests with ratings 
were consistently higher than those obtained 
for the comparison of the PH with ratings. 
This difference was significant at the .06 level 
of confidence. 

It was concluded that, under the condi- 
tions of the present study, the efficacy of the 
objective tests employed was at least as great 
as that of the PH, and very probably greater. 
It would thus seem that the use of the more 


expensive PH is not warranted in terms of 
cost. 


Received May 8, 1953. 


References 


1. Moses, L. E. Non-parametric statistics for psy- 
chological research. Psychol. Bull., 1952, 49, 
133-136. 

. Schofield, W. Research in clinical psychology: 
1951. J. clin. Psychol., 1952, 8, 255-261. 

3. Spencer, G. J. and Worthington, R. E. Validity 
of a projective technique in predicting sales 
effectiveness. Personnel Psychol., 1952, 5, 
125-144. 

4. Swint, E. R. and Newton, R. A. The personal
history,—a second report. Reprinted from J. 
Ind. Train. Jan.—Feb., 1952. 

5. Worthington, R. E. Use of the personal history 
form as a clinical instrument. Unpublished
Ph.D. dissertation. The University of Chi- 
cago, June, 1951. 





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


A Factor-Analytic Study of Supervisory and Group Behavior¹
Robert C. Wilson, Wallace S. High, Helen P. Beem 


University of Southern California 


and 


Andrew L. Comrey 


University of California, Los Angeles 


A series of studies designed to isolate fac- 
tors related to organizational effectiveness has 
been in progress at the University of Southern 
California (1, 2, 7). The general approach 
has involved the selection of a number of 
similar work units or organizations which 
would be divided into “high,” “medium,” and 
“low” groups with respect to criterion data 
of effectiveness. Questionnaires have been 
administered to individuals within the work 
units and the data analyzed against the cri-
terion to determine if the individuals in the
more effective work units answer the ques- 
tions differently from those in the less effec- 
tive work units. 

During the course of this work, the ques- 
tionnaires employed have gone through sev- 
eral stages of development. The current 
form involves the use of groups of homo- 
geneous items or “dimensions” developed for 
the purpose of assessing characteristics of or- 
ganizations hypothesized to have some rela- 
tionship to their effective operation. 

The dimensions have been revised from 
one study to the next, usually in the direc- 
tion of increasing item homogeneity. This 
process has been carried out on the basis of 
item analysis information. Intercorrelations 
among dimension total scores, however, have 
made it clear that the dimensions overlap 
considerably, despite differences in the ap- 
parent content of the items making up those 
dimensions. 

It was decided that a factor-analytic study
of the principal dimensions would provide
further information about the relationships
among them and provide a basis for further
revision and culling to yield a more economi-
cal coverage of the domain. Further, it was
believed that such a study would suggest
areas which might be explored more fully by
developing new dimensions.

¹This research was carried out under Contract
N6-ONR-23815 between the University of Southern
California and the Office of Naval Research. The
opinions expressed are our own and are not neces-
sarily shared by the Office of Naval Research. The
project is directed by J. M. Pfiffner, with J. P. Guil-
ford and H. J. Locke as associate responsible in-
vestigators.


Procedure 


The Sample. The questionnaire was adminis- 
tered to 98 civilian journeymen at the Long 
Beach Naval Shipyard. The journeymen are 
skilled tradesmen who work on all phases of ship 
overhaul, repair, and construction for the U. S. 
Navy. Biographical data revealed that medians 
for the following variables were: (1) age, 40.2; 
(2) highest school grade reached, 11.6; (3) 
months worked for present supervisor, 12.6; (4) 
years in shipyard, 5.0; and (5) years in civil 
service, 5.8. 

The Questionnaires. Thirteen dimensions, or 
homogeneous groups of multiple-choice items, 
were factor-analyzed. The measuring instrument 
for each dimension contained six or eight items 
put in the following objective-item form: 


If some worker gets too ‘eager,’ employees 
put pressure on him to make him quit working 
so hard: 1. always; 2. usually; 3. sometimes; 4. 
rarely; 5. never. 


In this item, as with all others, the five re- 
sponse categories were arranged on a continuum 
from a response which expressed infrequency or 
very little of the variable in question, in this case 
Lack of Informal Pressures to Restrict Produc- 
tion, to a response which expressed frequency or 
a great deal of the variable in question. The re-
sponses were arbitrarily weighted from one to
five, according to the number preceding the re- 
sponse. 

To facilitate computation of reliability esti-
mates, the six or eight items for each dimension
were separated into comparable halves or sub-
dimensions of 3 or 4 items. This also supplied
more variables for the factor analysis.² A total
score for each sub-dimension was obtained by
adding the scores assigned to the responses of an
individual for the items in that particular sub-
dimension.

²If each dimension does represent a separate fac-
tor, an opportunity has thus been provided for the
factor to come out. If doublet factors appear for
these pairs of sub-dimensions, they can be regarded
as specific factors for this analysis.
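The scoring rule just described can be sketched as follows. This is a minimal illustration, not the authors' computing procedure: the response data are hypothetical, and the Spearman-Brown step-up of the half-to-half correlation is an assumption, since the text says only that the halves were formed to facilitate reliability estimates.

```python
from math import sqrt


def subdimension_score(weights):
    """Total score for one sub-dimension: the sum of the 1-5 response weights."""
    if any(w not in (1, 2, 3, 4, 5) for w in weights):
        raise ValueError("each response weight must be an integer from 1 to 5")
    return sum(weights)


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-to-half correlation up to full length (assumed estimate)."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data: five respondents, one six-item dimension split into
# two comparable three-item halves (sub-dimensions A and B).
respondents = [
    ([4, 5, 4], [5, 4, 4]),
    ([2, 1, 2], [1, 2, 3]),
    ([3, 3, 4], [3, 4, 3]),
    ([5, 5, 5], [4, 5, 5]),
    ([1, 2, 2], [2, 2, 1]),
]
half_a = [subdimension_score(a) for a, _ in respondents]
half_b = [subdimension_score(b) for _, b in respondents]
r_half = pearson_r(half_a, half_b)
print("sub-dimension A totals:", half_a)
print("sub-dimension B totals:", half_b)
print("half-to-half r:", round(r_half, 3))
print("stepped-up estimate:", round(spearman_brown(r_half), 3))
```

The two lists of totals are also the form in which each pair of sub-dimensions would enter the factor analysis as separate variables.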

The Variables. The sub-dimensions included 
in the analysis were composed of items designed 
to reveal two kinds of information: (1) perceived 
characteristics of the respondent’s supervisor in 
his relations with employees; and (2) attitudes 
and interactions among members of the respond- 
ent’s work group. A sample item has already 
been given for one pair of sub-dimensions, 11 
and 12, Lack of Informal Pressures to Restrict 
Production. For the other sub-dimension pairs, 
sample items are given below. When reference 
is made to a third person, e.g., “he” or “him,” 
the person referred to is the respondent’s super-
visor. (1 and 2) Pride in Work Group: You are
proud of the work record of your unit: not at all
. . . very much; (3 and 4) Absence of Dissen-
sion: There are people in your unit who refuse
to speak to each other: several . . . none; (5
and 6) Friendly Group Atmosphere: There is
friendly kidding between people in your unit:
not much . . . very much; (7 and 8) Group
Cohesion: People in your unit act as a group to
get things they want: never . . . frequently; (9
and 10) Intensity of Informal Control: There
are certain workers in your unit, besides the boss,
who seem to lead the others: not at all . . . very
much; (11 and 12) see above; (13 and 14) Par-
ticipation: He is willing to listen to your ideas:
never . . . always; (15 and 16) Lack of Arbi-
trariness: He hates to have employees disagree
with him: always . . . never; (17) Non-Appre-
hension of Authority:³ The employees seem to
be afraid of him: very much . . . not at all;
(18 and 19) Being Informed: He passes on in-
teresting bits of information he gets from the
front office: never . . . very frequently; (20 and
21) Feedback: He lets you know how you are
doing: almost never . . . very frequently; (22
and 23) Attitude Toward Safety Enforcement:
He tries to see that safety rules are observed:
almost never . . . very frequently; (24 and 25)
Social Nearness: He has close friends among his
employees: none . . . four or more.

In the above items, the response given a weight
of “1” on the dimension appears first, with the 
response given a weight of “5” following. In- 
termediate responses have been omitted to save 
space. 

Four of the nine extracted factors gave some 
appearance of generality as regards the dimen- 
sions included in this analysis. Three of the 
factors were largely specific to a dimension pair 
and two were residual factors. 


³One of the Non-Apprehension of Authority sub-
dimensions was not included in the factor analysis 
because of lack of item homogeneity. 


Interpretation of the Factors 


The factors are presented in the approxi- 
mate order of their clarity. Interpretations 
rest upon those variables with loadings of .40 
and above. The factor loadings of dimen- 
sions defining the factors are given in Table 1. 
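The .40 rule for choosing the variables on which an interpretation rests can be sketched as a simple filter. The loadings below are hypothetical placeholders rather than values from Table 1, and the use of the absolute value, so that a large negative loading would also count, is an assumption not stated in the text.

```python
def salient_variables(loadings, threshold=0.40):
    """Group variables by factor, keeping loadings at or above the threshold.

    `loadings` maps a variable name to a {factor label: loading} dict.
    Absolute values are compared (an assumption of this sketch).
    """
    by_factor = {}
    for variable, row in loadings.items():
        for factor, value in row.items():
            if abs(value) >= threshold:
                by_factor.setdefault(factor, []).append((variable, value))
    # List the strongest loadings first within each factor.
    for pairs in by_factor.values():
        pairs.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return by_factor


# Hypothetical rotated loadings for three sub-dimensions on three factors.
hypothetical_loadings = {
    "Participation A": {"I": 0.62, "III": 0.41, "IV": 0.12},
    "Feedback A": {"I": 0.58, "III": 0.05, "IV": 0.09},
    "Group Cohesion A": {"I": 0.10, "III": 0.22, "IV": 0.55},
}
for factor, pairs in sorted(salient_variables(hypothetical_loadings).items()):
    print(factor, pairs)
```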

Factor I. Supervisor-Subordinate Rapport. 
The dimensions with significant loadings on 
this factor seem to reflect the extent to which 
a consultative, communicative type of rela- 
tionship exists between the supervisor and 
his subordinates. The items on the Partici- 
pation dimension are principally concerned 
with the supervisor’s receptiveness to the 
ideas and opinions of his subordinates. The 
Lack of Arbitrariness dimension was intended 
to measure the degree to which the supervisor 
is not dogmatic and arbitrary regarding his 
orders and decisions. Non-Apprehension of 
Authority reflects the extent to which sub- 
ordinates are not afraid of their supervisor. 
Feedback concerns the extent to which the 
supervisor informs his subordinates as to 
what he expects of them and lets them know 
how well they are meeting his expectations. 
Being Informed concerns the extent to which 
the supervisor tells his subordinates about 
things that are going on in the organization 
which may be of interest to them. 

This factor seems to reflect the so-called 
“human relations” approach to supervision. 
The composition of this factor is quite simi- 
lar to that of a factor called “Consideration” 
reported by Fleishman (3) and Gekoski (4). 
Gekoski attributes the factor to an unpub- 
lished factor analysis by Shartle and Hemp-
hill (5). The “Consideration” factor is de-
scribed as including “behavior which is in-
dicative of friendship, mutual trust, respect,
and a certain warmth between the supervisor 
and group” (4). 

An alternative interpretation of the Super- 
visor-Subordinate Rapport factor is that it 
represents halo effect. Since all the dimen- 
sions appearing on the factor involve judg- 
ments about the supervisor, there is un- 
doubtedly some tendency for the respondents 
to rate a supervisor uniformly high or low. 
On the other hand, substantial correlations 
among these rating variables might well be 
expected quite apart from any halo effect, in 





Study of Supervisory and Group Behavior 91 


that the person who utilizes participation in 
supervision is less likely to be arbitrary, prob- 
ably tends to keep his subordinates informed, 
and so on. Perhaps the most likely inter- 
pretation of this factor is that it represents 
the combined effects of a common factor of 
Supervisor-Subordinate Rapport and halo. 

Factor II. Congenial Work Group. Fac-
tor II represents the degree to which there
is absence of discord and the presence of 
friendly interaction among individuals and 
groups of individuals within the work unit. 
The items on Absence of Dissension are con- 
cerned with the lack of negative interactions, 
such as friction between workers, bad feel- 
ing, refusal to speak to one another, etc., 
while most of the items on Friendly Group 
Atmosphere emphasize the positive aspect of 
worker relations, i.e., friendly kidding, good 
feeling, and high morale in the unit. The 
majority of items on Lack of Informal Pres- 
sures to Restrict Production concerns the ab- 
sence of animosities toward workers who are 
more productive than the others in the group. 
If informal work standards have been gener- 
ated within the group, and social approval is 
withheld from those who do not conform to 
these norms, some antagonism or dissension 
may be present. 

Factor III. Informal Control. This fac- 
tor represents the extent to which certain in- 
dividuals outside the official chain of com- 
mand exercise influence over the work group, 
or, more generally, the degree to which the 
group contains informal status hierarchies. 
Although the dimension Intensity of Infor-
mal Control contains items regarding several
aspects of the influence of fellow employees 
upon the individual, the major component 
appears to be reflected in such items as: 
“There are certain workers in your unit be- 
sides the boss who seem to lead the others.” 
Such informal or indigenous leadership in 
work groups is a common phenomenon, docu- 
mented by much research in this area. The 
loading of Participation on this factor sug- 
gests that this type of informal leadership is 
more likely to occur in a group whose super- 
visor is regarded as receptive to the ideas and 
suggestions of employees about the work. If 
the informal leadership is operating in the
work situation, rather than simply in social
situations, its emergence in such a situation 
may be due either to the supervisor’s delega- 
tion of authority or to his lack of vigorous 
leadership. 

Factor IV. Group Unity. Factor IV ap- 
pears to represent the degree to which the 
group works together for a common purpose 
and the extent to which the group is ready to 
take action as a unit either on behalf of one 
of its members or on behalf of the group it- 
self. Group Cohesion and Friendly Group 
Atmosphere had substantial loadings on this 
factor for both dimension halves, with the 
former dimension contributing more heavily. 
The presence of the latter dimension on this 
factor suggests that a congenial interaction 
among group members is a necessary condi- 
tion for a closely-knit work group. The pres- 
ence of Lack of Arbitrariness possibly indi- 
cates that non-dogmatic behavior on the su- 
pervisor’s part may be conducive to friendly 
informal organization among the workers. 

Factors V, VI, and VII. These factors
emerged as doublet factors defined by the 
three pairs of variables, Attitude Toward 
Safety Enforcement A and B, Social Near- 
ness A and B, and Pride in Work Group A 
and B. The fact that they emerged as sepa- 
rate factors is evidence that they are measur- 
ing something different from the other vari- 
ables in the analysis. They will serve the 
purpose of indicating areas in which we need 
to construct additional dimensions. In fur- 
ther analyses more general factors may occur 
in these areas. 


Summary 


Questionnaires were given to 98 skilled 
tradesmen at a naval shipyard. Items in the 
questionnaires were grouped to measure: (1) 
supervisory practices in relations with em- 
ployees such as Participation, Lack of Arbi- 
trariness, Being Informed, Feedback, Atti- 
tude Toward Safety Enforcement, and Social 
Nearness; and (2) attitudes and interactions 
of the members of the work group, such as 
Pride in Work Group, Absence of Dissension, 
Friendly Group Atmosphere, Group Cohesion, 
Intensity of Informal Control, Lack of In- 
formal Pressures to Restrict Production, and 
Non-Apprehension of Authority. Each of
these item groups, or dimensions, was divided
into two item pools, each pool containing 
three or four items. Twenty-five of these 
questionnaire variables were intercorrelated 
and factor analyzed by the centroid method. 
Rotation of the centroid axes to meaningful 
positions was carried out.⁴
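For readers unfamiliar with the centroid method, the extraction of a first centroid factor can be sketched as below. This is a textbook-style illustration under stated assumptions, not the authors' worksheet: the three-variable correlation matrix is hypothetical, the diagonal holds the highest correlation in each column as a communality estimate (a common convention assumed here), and the later steps of sign reflection, repeated extraction, and rotation are omitted.

```python
from math import sqrt


def first_centroid_factor(r):
    """First centroid loadings: each column sum over the root of the grand sum."""
    size = len(r)
    column_sums = [sum(r[i][j] for i in range(size)) for j in range(size)]
    grand_sum = sum(column_sums)
    if grand_sum <= 0:
        raise ValueError("grand sum must be positive; reflect variables first")
    return [s / sqrt(grand_sum) for s in column_sums]


def residual_matrix(r, loadings):
    """Correlations remaining after the factor's contribution is removed."""
    size = len(r)
    return [
        [r[i][j] - loadings[i] * loadings[j] for j in range(size)]
        for i in range(size)
    ]


# Hypothetical reduced correlation matrix (communality estimates on the diagonal).
r = [
    [0.60, 0.60, 0.40],
    [0.60, 0.60, 0.50],
    [0.40, 0.50, 0.50],
]
loadings = first_centroid_factor(r)
residuals = residual_matrix(r, loadings)
print("first centroid loadings:", [round(a, 3) for a in loadings])
print("residual column sums:", [round(sum(col), 6) for col in zip(*residuals)])
```

Each column of the residual matrix sums to zero after the first extraction, which is why some variables must be reflected before a second centroid factor can be taken out.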

The variables calling for evaluation of su- 
pervisory practices all emerged on a factor 
called Supervisor-Subordinate Rapport which 
appeared to reflect a consultative, communi- 
cative type of relationship between the super- 
visor and his subordinates. 

Three other important factors were related
to relationships among the workers them-
selves: Congenial Work Group, Informal
Control, and Group Unity. The first of
these was concerned primarily with the
absence of discord and the presence of
friendly interaction within the work unit,
the second with the degree to which there is
informal leadership in the group, and the
third reflected the extent to which the group
is unified and is ready to take action as a
unit to get something for the group or for
one of its members. The remaining identi-
fiable factors were doublets, defined by the
paired sub-dimensions, Attitude Toward
Safety Enforcement, Social Nearness, and
Pride in Work Group.

⁴To reduce printing costs, the tables of intercor-
relations, complete rotated and unrotated factor load-
ings and the complete set of items (13 pages) have
been deposited with the ADI Auxiliary Publications
Project. Order Document No. 4116 from Chief,
Photoduplication Service, ADI Auxiliary Publica-
tions Project, Library of Congress, Washington 25,
D. C., remitting $1.75 for microfilm (images 1 inch
high on standard 35 mm. motion picture film) or
$2.50 for photocopies (6 × 8 inches) readable with-
out optical aid. Advance payment is required.
Make checks or money orders payable to: Chief,
Photoduplication Service, Library of Congress.


Received June 8, 1953. 


References 


1. Comrey, A. L., Pfiffner, J. M., and Beem, Helen
P. Factors influencing organizational effec-
tiveness. I. The U. S. Forest survey. Per-
sonnel Psychol., 1952, 5, 307-328.

2. Comrey, A. L., Pfiffner, J. M., and Beem, Helen
P. Factors influencing organizational effec-
tiveness. II. The Department of Employment
survey. Personnel Psychol., 1953, 6, 65-79.

3. Fleishman, E. A. The description of supervisory
behavior. J. appl. Psychol., 1953, 37, 1-6.

4. Gekoski, N. Predicting group productivity. Per-
sonnel Psychol., 1952, 5, 281-292.

5. Report Number 6, RF Project 403. Research
Foundation, The Ohio State University, 1951.

6. Thurstone, L. L. Multiple factor analysis. Chi-
cago: The University of Chicago Press, 1947.

7. Wilson, R. C., Beem, Helen P., and Comrey, A. L.
Factors influencing organizational effectiveness.
III. A survey of skilled tradesmen. Person-
nel Psychol., 1953, 6, 313-326.

8. Zimmerman, W. S. A simple graphical method
for orthogonal rotation of axes. Psycho-
metrika, 1946, 11, 51-55.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


The Check List as a Criterion of Proficiency¹


Arthur I. Siegel 


Institute for Research in Human Relations, Philadelphia 


The check list represents a particularly at- 
tractive tool for measuring a man’s ability to 
perform a task. A performance check list is 
prepared by analyzing a task into the com- 
ponent actions which a man performs in 
order to complete the task. In some cases the 
task involves making something. In this case 
an end product evolves and the end product 
is also analyzed in terms of its adherence to 
certain prescribed standards and its freedom 
from defect. An examinee receives credit for 
each of the analytic performance components 
with which he conforms and each of the ele- 
mentalistic aspects of his end product that 
meets prescribed standards. A total task 
score is derived by adding a man’s credits on 
each of the analytic components of perform- 
ance and end product adherence to prescribed 
standards. 
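The scoring rule described above amounts to a simple tally, which may be sketched as follows. The item wordings are taken from the welding illustration in this article; the examinee's record is hypothetical, and the award of one unit of credit per item is an assumption, since the text does not say whether items were weighted.

```python
def check_list_score(items, observed):
    """Total task score: one credit for every check-list item that was met.

    `items` is the full check list; `observed` maps an item to True or False.
    Items the examiner did not mark are counted as not met.
    """
    unknown = set(observed) - set(items)
    if unknown:
        raise KeyError(f"responses for items not on the check list: {sorted(unknown)}")
    return sum(1 for item in items if observed.get(item, False))


# Performance-in-process, safety, and end-product items on one list.
welding_items = [
    "cleans base metal and rods",
    "preheats base metal",
    "uses goggles when welding",
    "bead width 3-5 times metal thickness",
]

# Hypothetical record for one examinee.
observed = {
    "cleans base metal and rods": True,
    "preheats base metal": False,
    "uses goggles when welding": True,
    "bead width 3-5 times metal thickness": True,
}
print("total task score:", check_list_score(welding_items, observed))
```

Keeping process, safety, and end-product items on a single list is what allows one total to reflect infractions that leave no trace on the finished piece.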

For instance, a performance check list for 
welding consists of items relating to the way 
the welder performs his job (e.g., “cleans 
base metal and rods,” “adjusts oxyacetylene 
regulators to 4-5 pounds,” “preheats base
metal,” etc.); the safety precautions the 
welder follows (e.g., “does not open acety- 
lene cylinder valve more than 114 turns,” 
“uses goggles when welding,” “makes sure fire
extinguisher in area before igniting torch”);
and items relating to adherence of the final
weld to prescribed standards (e.g., “bead
width 3-5 times metal thickness,” “bead
height 25-50% of metal thickness,” etc.).
The welder is given credit for each of the
items with which his performance or final
weld conforms, and his total score is the sum
of these credits. The problem, as mentioned
by Thorndike,² is that there may be aspects of
a job that are lost in the analytic approach,
so that scoring the elementary items does not
give an entirely adequate evaluation. Spe-
cifically, the scores obtained by subjects
scored in an analytic manner may not cor-
relate highly with the over-all, “clinical”
judgments of experts as to the quality of a
final product produced.

¹The data herein reported are a small portion of
the data gathered under Contract Nonr-872(00) be-
tween the Institute for Research in Human Relations
and the Office of Naval Research. The opinions ex-
pressed are those of the author and do not neces-
sarily represent the opinions of the Office of Naval
Research or of the Naval Service.

²Thorndike, R. Personnel selection. New York:
John Wiley, 1949.

If performance check list scores do cor- 
relate highly with expert, “clinical” judgments 
or rankings of end products, the performance 
“check list” is to be preferred. This follows 
since more objectivity may be introduced 
into the check list, examiner reliability may 
be increased by the check list, test reliability 
may be increased by the check list and less 
background and less experience in the par- 
ticular test task is required by the examiner 
who uses the check list than by the examiner 
who makes a “clinical” appraisal. Moreover, 
if performance in process is checked as well 
as the quality of the final product, certain in- 
sights may be gained which could be missed 
by only an over-all, final appraisal of end
products. 


Method 

Performance check lists³ were constructed
for four tasks: aluminum butt welding;
patching a hole in plastic; splicing a cracked
aircraft channel; and aircraft fabric repair.
The inter-examiner reliabilities for both the
aluminum butt welding and fabrics lists were
coincidentally at .92. The inter-examiner re-
liabilities for the plastics and channel splic-
ing lists were not ascertained. However, the
inter-examiner reliabilities on four other check
lists similar to the ones herein discussed
ranged from .91 to .97. Likewise, the intra-
examiner reliability by retest methods for
measurements made on the adherence of end
products to prescribed standards was ascer-
tained only for the welding and fabrics lists.
These intra-examiner reliabilities were .93 and
.87, respectively. Intra-examiner reliability
for observations made of performance in
process is difficult to obtain by the retest
method. This difficulty follows because of
the relative impossibility of having an ex-
aminee perform the same task in exactly the
same manner on two separate occasions.
However, by a motion picture technique, the
intra-examiner reliability for performance in
process was found to be .93 for the welding
test. The intra-examiner reliabilities for per-
formance in process of the remaining lists
were not determined.

³To save printing costs, the performance check
lists as well as the correlational matrices upon which
our discussion is based have been deposited with the
ADI Auxiliary Publications Project. Order Docu-
ment No. 4038 from Chief, Photoduplication Serv-
ice, ADI Auxiliary Publications Project, Library of
Congress, Washington, D. C., remitting $1.75 for 35
mm. microfilm or $2.50 for 6 by 8 inch photocopies.

The aluminum butt welding, plastic patch- 
ing, channel splicing, and fabric repair tasks 
were first administered to 15 aviation struc- 
tural mechanics at the Naval Air Technical 
Training Unit, Memphis. Four of the sub- 
jects held the Naval rate of “striker,” four 
held the rate of third class, four the rate of 
second class and three the rate of first class 
Aviation Structural Mechanic. All of the jobs 
represented in these tests are tasks which 
Naval Aviation Structural Mechanics typi- 
cally perform. The mean service length of 
the examinees was 54.6 months with a stand- 
ard deviation of 38.3 months. The mean 
scores and standard deviations for the group 
on each of the tests follow: aluminum butt 
welding, mean 15.0, sigma 6.9; splicing a 
cracked channel, mean 44.3, sigma 6.8; 
plastic repair, mean 20.8, sigma 6.5; and 
fabric repair, mean 33.4, sigma 9.0. The ex- 
aminers were Chief Aviation Structural Me- 
chanics who held instructors’ billets at the 
Naval Aviation Structural Mechanics School, 
Memphis. The examinees were unknown to 
the examiners prior to the testing situation. 

The end products produced by each ex- 
aminee on each of these tests were taken to 
the Naval Air Station, Atlantic City. At At- 
lantic City, five Chief Aviation Structural 
Mechanics were asked to rank, from best to 
worst, the end products produced by the ex- 
aminees on each of the tests. 


Results 


The correlations of the rankings of the 
Chief Petty Officers who ranked the end 
products from each test with the rankings 
produced by our analytic and synthetic ap-
proach and also the correlations between the
rankings of the various chiefs who ranked 
the end products at Atlantic City were calcu- 
lated. All of these correlations were rank 
difference correlations. If the experts (Naval 
Chief Aviation Structural Mechanics) agreed 
among themselves more than they agreed with 
the rankings produced by the analytic and 
synthetic approach, then the analytic and 
synthetic approach has lost something that 
these experts considered to be important in 
making their rankings. On the other hand, 
if the experts agreed with the rankings pro- 
duced by the check list as much as they 
agreed among themselves, then little has been 
lost by the analytic and synthetic approach. 

The median rank difference correlation be- 
tween the rankings of the chiefs at Atlantic 
City of the end products from each test was 
then obtained. The median rank difference 
correlation of the chiefs’ rankings with the 
rankings produced by our analytic and syn- 
thetic check list approach was also calculated. 
These rhos are presented in Table 1. 
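The two medians compared in this study can be sketched as follows. The rankings are hypothetical (three judges and five end products rather than the five chiefs and fifteen examinees of the study), and the rank-difference formula shown assumes that no ranks are tied.

```python
from itertools import combinations
from statistics import median


def rank_difference_rho(rank_a, rank_b):
    """Rank-difference correlation, 1 - 6*sum(d^2) / (n(n^2 - 1)), no ties."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must cover the same two or more objects")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical rankings of five end products (1 = best).
chiefs = {
    "chief_1": [1, 2, 3, 4, 5],
    "chief_2": [2, 1, 3, 4, 5],
    "chief_3": [1, 3, 2, 5, 4],
}
check_list_ranks = [1, 2, 4, 3, 5]  # ranks derived from check-list totals

# Median agreement among the judges, over every pair of judges.
between_chiefs = median(
    rank_difference_rho(chiefs[a], chiefs[b]) for a, b in combinations(chiefs, 2)
)
# Median agreement of the judges with the check-list ranking.
with_check_list = median(
    rank_difference_rho(ranks, check_list_ranks) for ranks in chiefs.values()
)
print("median rho between chiefs:", round(between_chiefs, 2))
print("median rho of chiefs with check list:", round(with_check_list, 2))
```

If the second median falls well below the first, the check list has presumably lost something the judges weigh; if the two are about equal, little has been lost.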

Table 1

Median Correlations Between Chiefs and Between
Rankings of Chiefs with Analytic Approach

                          Median Rho       Median Rho of Chiefs
Test                      Between Chiefs   with Analytic Approach

Welding                   .95              .41
Plastics                  .89              .66
Structural Maintenance    .37              .26
Fabrics                   .29              .33

For the welding test, the median rank dif-
ference correlation of the chiefs’ rankings
with the rankings produced by the analytic
and synthetic approach was .41. For the
plastics test, the structural maintenance test
and the fabrics test, the median rank differ-
ence correlations of the chiefs’ rankings with
the rankings produced by the analytic and
synthetic approach were .66, .26 and .33, re-
spectively. For the welding test, the plastics
test and the structural maintenance test, the
median rank difference correlations between
the chiefs’ rankings were .95, .89, and .37,
respectively, and these three rhos were greater
than the correlation of the chiefs’ rankings
with the rankings produced by the analytic
and synthetic approach. However, the re-
verse was true for the fabrics test; median
rho between chiefs .29, median rho of chiefs
with analytic approach .33. All median rank
difference correlations were then converted
to product moment correlations and the prod-
uct moment correlations transformed to z’s
(r to z’ transformation). The significance of
the difference was then calculated between
the z’s which represented the median correla-
tions between the chiefs’ rankings of the end
products of each test and the z’s which rep-
resented median correlations of the chiefs’
rankings with the rankings produced by the
analytic and synthetic approach for the same
tests.
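The sequence of conversions described above may be sketched as follows. The article does not state which formulas were used, so three assumptions are made here: the conversion r = 2 sin(πρ/6) from rank-difference to product-moment correlation, the usual r to z′ transformation, and a standard error of the difference of sqrt(1/(N₁ − 3) + 1/(N₂ − 3)) with N = 15 examinees for each correlation. The sketch also treats the two correlations as independent, which median correlations from the same judges strictly are not.

```python
from math import atanh, pi, sin, sqrt


def rho_to_r(rho):
    """Convert a rank-difference rho to a product-moment r (assumed formula)."""
    return 2.0 * sin(pi * rho / 6.0)


def fisher_z(r):
    """The r to z' transformation."""
    return atanh(r)


def critical_ratio(rho_a, rho_b, n_a, n_b):
    """Difference between two z' values divided by its standard error."""
    z_a = fisher_z(rho_to_r(rho_a))
    z_b = fisher_z(rho_to_r(rho_b))
    standard_error = sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    return (z_a - z_b) / standard_error


n_examinees = 15  # number of examinees whose end products were ranked
reported = [
    ("Welding", 0.95, 0.41),   # median rho between chiefs, with check list
    ("Plastics", 0.89, 0.66),
]
for test, between_chiefs, with_check_list in reported:
    ratio = critical_ratio(between_chiefs, with_check_list, n_examinees, n_examinees)
    print(test, "critical ratio:", round(ratio, 2))
```

Under these assumptions a critical ratio beyond about 1.96 would be significant at the .05 level, two-tailed.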

Of the four tests of significance calculated, 
only the difference between the median cor- 
relation between the rankings of the chiefs 
and the median correlation of the chiefs’ 
rankings with the rankings produced by the 
analytic and synthetic approach for the weld- 
ing test was statistically significant. How- 
ever, at this juncture, it is well to point out 
that three of the four median rank difference 
correlations between the chiefs’ rankings were 
greater than the median correlations of the 
chiefs’ rankings with the rankings produced
by the analytic and synthetic approach. 

Thus it seems that we were not able to 
demonstrate conclusively that the Chief Petty 
Officers in our sample agreed with the rank- 
ings produced by the analytic and synthetic 
approach more than they agreed with each 
other. On the other hand, in view of the bias 
(here controlled) that usually enters into 
judgments of end products, when judgments 
are made in actual work situations, it seems 
probable that any loss indicated by the three 
lower correlations of the chiefs’ rankings with 
the rankings produced by the analytic and 
synthetic approach as compared with the cor- 
relations between the chiefs’ rankings would 
be compensated for if this bias had been al-
lowed to operate here as a confounding vari-
able. Moreover, some procedural elements
were not apparent to the chiefs when they 
made their judgments of the final products. 
For instance, a man may break safety rules 
and lose a certain amount of credit. Infrac-
tions such as these are not seen when mak-
ing appraisals of end products, but may be
too important to omit from consideration 
when estimating a man’s real work ability. 
Since the chiefs who did the ranking had only 
end products to evaluate, it seems probable 
that part of the loss indicated by the lower 
correlations between the chiefs’ rankings and 
the rankings produced by the analytic and 
synthetic approach as compared with the be- 
tween-chiefs’ correlations may also be as- 
signable to distortions in the rankings of the 
analytic and synthetic approach due to poor 
care and use of equipment, violation of safety 
precautions, etc. on the part of the examinees. 


Summary 


As mentioned by Thorndike, there may be 
aspects of a job that are lost in breaking 
down a job in terms of the elemental, analytic 
components comprising the job. If this is so, 
then scoring the elemental, analytic compo- 
nents of a task does not give an entirely ade- 
quate evaluation. Therefore, an investiga-
tion was performed into the relationship be- 
tween total scores assigned via a scoring of 
the elemental components of a job (perform- 
ance check list approach) and over-all, “clini- 
cal” judgments of experts (Naval Chief Petty 
Officers) as to the quality of a final product 
produced. Rank difference correlations were 
calculated between the scores assigned via a 
performance check list and judgments of ex- 
perts as to the quality of final products. In- 
ter-correlations between the rankings of the 
experts on the quality of end products pro- 
duced were also calculated. The following 
conclusions seem warranted. 

1. Three out of four median correlations 
between the rankings of Naval Chief Aviation 
Structural Mechanics were not significantly 
different from correlations of the chiefs’ rank- 
ings with scores obtained by an analytic and 
synthetic approach. 

2. Although no statistical differences were 
shown in three of the four pairs of rank dif- 
ference correlations under consideration, the 
tendency was toward greater agreement be- 
tween experts’ rankings than between the 
rankings of the experts and the rankings of 
the analytic and synthetic approach. 


Received May 26, 1953.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Identification and Prediction of Two Training Criterion Factors¹


Warren R. Graham 
New York, N. Y. 


The problem of identifying and predicting 
the variables which are involved in the suc- 
cessful completion of a comprehensive train- 
ing program was attacked in this study. 
Twelve examination scores from six courses 
given as part of the Naval Pre-Flight train- 
ing program were defined as the training cri- 
terion. 

The nature of the criterion was studied by 
the Thurstone centroid factor method. The
pre-flight examinations criterion was hypothe- 
sized to be composed of two or more factors 
which could be identified and separately pre- 
dicted. An attempt was made to predict the 
factors by an adaptation of the Doolittle 
multiple correlation method. 
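The general idea of a multiple correlation obtained by successive elimination can be sketched as below. This is not the author's adaptation of the Doolittle method, whose details are not given here; it is a plain forward-elimination solution of the normal equations, with a hypothetical two-predictor correlation matrix and hypothetical validities.

```python
from math import sqrt


def solve_by_elimination(a, b):
    """Solve a x = b by forward elimination and back substitution."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for k in range(n):
        pivot = m[k][k]
        if abs(pivot) < 1e-12:
            raise ValueError("zero pivot; predictors may be linearly dependent")
        for i in range(k + 1, n):
            factor = m[i][k] / pivot
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_correlation(r_xx, r_xy):
    """Return the standard partial regression weights and multiple R."""
    betas = solve_by_elimination(r_xx, r_xy)
    r_squared = sum(beta * r for beta, r in zip(betas, r_xy))
    return betas, sqrt(r_squared)


# Hypothetical values: two predictors intercorrelated .50, with validities
# of .60 and .40 against one criterion factor.
r_xx = [[1.0, 0.5], [0.5, 1.0]]
r_xy = [0.6, 0.4]
betas, multiple_r = multiple_correlation(r_xx, r_xy)
print("beta weights:", [round(beta, 3) for beta in betas])
print("multiple R:", round(multiple_r, 3))
```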


Procedure 


In the parent study (3) a battery of stand- 
ard tests was administered to entering stu- 
dents, and scores from these predictors were 
correlated with subsequent classroom and 
final examination grades (criterion variables). 
The present study employs the ten predictors 
which had produced regression coefficients 
indicating that they were the best predictors 
of the criterion variables. 


The sample studied consisted of 399 students 
in four classes who were homogeneous in having 
met the following requirements: (1) minimum 
physical standards; (2) minimum standards on 
a Naval Aviation selection battery; (3) comple- 
tion of two years of college; and (4) between 
19 and 25 years of age. 

The predictor and criterion variables employed 
were as follows: 


A. The Predictor Variables:
2. Language Skills Test, G. E. D., college
level.
6. Minnesota Clerical—Name checking,
(1946).
7. Mathematics—Diagnostic, (Ability to
perform simple computations).


1 The author is indebted to Dr. Harold A. Edger- 
ton for guidance and suggestions offered during 
analysis of the data. The materials treated herein
were collected by Richardson, Bellows, Henry and 
Co., Inc., under Office of Naval Research Contract 
N 7 onr-383, TO-I, with the cooperation of the Spe- 
cial Devices Center and the Pensacola Naval Air 
Training Command. Interpretations and opinions 
are the author’s and should not be construed as 
having U. S. Navy endorsement. 




8. Mathematics—Reasoning, (No high 
school algebra or geometry required). 

9. Mathematics—Total, (Combined score 
for tests 7 and 8). 

10. American Council on Education Psy- 
chological Exam., 1947, Q Score, (Abil- 
ity to handle quantitative and other 
nonverbal problems). 

11. Above, L Score, (Linguistic or verbal
abilities).

12. Above—Total, (Combined score for 
tests 10 and 11—general academic abil- 
ity). 

13. Aviation Classification Test, (General 
academic ability). 

14. Mechanical Classification Test, (Ability 
to solve mechanical problems). 

B. The Criterion Variables:

16. (Daily),2 17. (Exam.),3 Aerology, (Meteorology).

23. (Daily), 24. (Exam.), Engines, (Parts,
functions, instrument checks, weights
and balances, etc.).

26. (Daily), 27. (Exam.), Essentials of
Naval Service, (Regulations, military
courtesy, leadership, etc.).

30. (Daily), 31. (Exam.), Principles of
Flight, (Aircraft nomenclature, lift,
drag, stability, compressibility, etc.).

33. (Daily), 34. (Exam.), Navigation,
Dead Reckoning, (Earth’s coordinates,
variation and deviation, winds, fixes,
etc.).

35. (Daily), 36. (Exam.), Navigation, Celestial,
(Astronomy, celestial triangle,
latitude by polarities, sextant, astro-compass,
time and radio signals, etc.).

39. Navy Grade, (a combined score for all
criterion variables).


Statistical Technique. The twelve criterion 
examinations were factored, yielding three fac- 
tors, of which two could be rotated to psycho- 
logically meaningful patterns of factor loadings, 
using Thurstone’s centroid method and graphic 
rotation. 

The intercorrelations of the ten predictor vari- 
ables which had been shown to be most related 
to the twelve criterion examinations in the par- 
ent study (3) were then added to the criterion 
intercorrelation matrix. This produced the rec- 
tangular intercorrelation matrix shown in Table 
1. The predictors were related to the criterion 
factors in such a way as to obtain rotated pre- 
dictor factor loadings without changing the re- 


2 Daily Grades. Averages of grades from instructor-prepared
weekly quizzes.

3 Exam. Grades. Grades on a two-hour final
exam. prepared by a central examining board.





Table 1

Criterion Correlation Matrix with Predictor Variables Added

[The body of this table, printed sideways in the original, is not recoverable from the scan. Column groups: Criterion Matrix; Predictor Matrix. Note: Decimals omitted, and reduced from three to two significant figures.]







sults obtained for the separate analysis of the 
criterion examinations.4

When the factor loadings for the predictors 
had been obtained, they were considered to be 
validity coefficients between the predictors and 
the criterion factors. The Doolittle multiple 
correlation method was then employed to get a 
prediction of each of the two criteria obtained 
(the criterion factors) by each predictor in terms 
of standard partial regression coefficients (beta 
weights). The regression coefficients represent 
the degree to which a predictor will predict each 
criterion when the influence of the other predic- 
tors has been partialled out or controlled. 

Thus, instead of predicting each of the twelve 
criterion variables or a composite of them, two 
factors which economically and accurately repre- 
sent them have been predicted. 

The composite variables (Math-Total, Navy 
Grade and American Council on Education—To- 
tal) for which factor loadings were computed, 
were not included in the Doolittle solution to 
avoid contamination due to spurious correlations 
with the tests which compose them. 

No measures of the reliability of the criterion 
variables are available. 


Results 


Tables 2 and 3 present the factors which 
resulted from analysis of the criterion ex- 
aminations. Two of the three factors are 
psychologically meaningful when rotated to 
simple structure. These are: 

Factor I. Navigation. Table 3 indicates
that the highest factor loadings occur for the 
four navigation criterion examinations (vari- 
ables 33, 34, 35, 36). As shown in Table 4, 
this factor was best predicted by the Minne- 
sota Clerical Test-Name Checking, var. 6; 
the Mathematics Diagnostic Test, var. 7; 
and the American Council on Education 
Test-Q Score, var. 10. It was negatively 
predicted by the ACE Test-L Score, var. 11, 
and the Mechanical Classification Test, var. 
14. 

In general, the Navigation factor is best 
predicted by the tests of arithmetic and cleri- 
cal (name-checking) abilities, and negatively 
predicted by the tests of linguistic abilities. 

Factor II. Verbal Reasoning. Table 3 
shows high loadings for all variables except 
those of an obviously non-problem solving 
nature. This indicates that general ability 
is of importance to success in courses in the 
Pre-Flight curriculum. Table 4 indicates
that this factor is best predicted by the 

4 See the Technical Appendix below for a description
of this technique.


Table 2


Centroid Factor Matrix (before rotation)


                                    I      II

Predictors
 2. Lang. Skills, GED              287    074
 6. Minn. Clerical Test, Names     257   -144
 7. Math. Diagnostic               346   -280
 8. Math. Reasoning                542   -105
 9. Math. Total                    511   -220
10. Am. Counc. Ed.—Q Score         318   -191
11. Am. Counc. Ed.—L Score         311    240
12. Am. Counc. Ed.—Total           362    057
13. Aviation Class. Test           441    023
14. Mech. Class. Test              361    104

Criterion
16. Aerology Daily                 673    135
17. Aerology Exam.                 648    130
23. Engines Daily                  701    159
24. Engines Exam.                  652    161
26. Ess. Naval Serv. Daily         597    145
27. Ess. Naval Serv. Exam.         444    158
30. Prin. Flight Daily             636    306
31. Prin. Flight Exam.             590    292
33. Nav. Dead Rec. Daily           744   -317
34. Nav. Dead Rec. Exam.           462   -435
35. Nav. Celest. Daily             630   -372
36. Nav. Celest. Exam.             543   -419
39. Navy Grade                     936   -090


Mathematics Reasoning Test, var. 8, the 
Aviation Classification Test, var. 13, and the 
Mechanical Classification Test, var. 14. The 
ACE Test-L Score, var. 11, is a less signifi- 
cant predictor than any of these three tests, 
but it is considerably more effective than the 
arithmetic and clerical (name-checking) tests.
In general, Factor II is best predicted by tests 
calling for reasoning ability. 

The Language Skills Test, GED-College 
Level, does not contribute to the prediction 
of either factor. 

Factor III. Unidentifiable Variance. Although
this factor is not clearly interpretable,
there is a possibility that it reflects relatively
specific variance in the examinations
for Essentials of Naval Service, var. 26, 27.
It might represent a personality or social factor
which has not been sufficiently determined
to be identified.

It is concluded that successful completion 
of the Navy Pre-Flight curriculum is depend- 
ent, in part, upon ability to perform simple 
arithmetic computations accurately, such abil- 
ity being required to pass courses in naviga- 
tion. That some clerical ability is required 
is indicated by the fact that the Minnesota 
Clerical Test-Name Checking, var. 6, is a 







Table 3


Centroid Factor Matrix (after two rotations)


                                    I        II     III

Predictors
 2. Lang. Skills, GED            [illeg.]   295    -050
 6. Minn. Clerical Test, Names   [illeg.]   215    -035
 7. Math. Diagnostic             [illeg.]   280     090
 8. Math. Reasoning              [illeg.]   505     020
 9. Math. Total                  [illeg.]   450    -140
10. Am. Counc. Ed.—Q Score       [illeg.]   275     025
11. Am. Counc. Ed.—L Score       [illeg.]   350    -045
12. Am. Counc. Ed.—Total         [illeg.]   365    -175
13. Aviation Class. Test         [illeg.]   440    -035
14. Mech. Class. Test            [illeg.]   375     275

Criterion
16. Aerology Daily               [illeg.]   680    -140
17. Aerology Exam.                 -010     660     072
23. Engines Daily                  -040     719     140
24. Engines Exam.                  -045     675     100
26. Ess. Naval Serv. Daily          040     610    -245
27. Ess. Naval Serv. Exam.          000     470    -255
30. Prin. Flight Daily             -205     685     165
31. Prin. Flight Exam.             -180     640     090
33. Nav. Dead Rec. Daily            415     665     259
34. Nav. Dead Rec. Exam.            540     362    -023
35. Nav. Celest. Daily              455     542     230
36. Nav. Celest. Exam.              525     450     025
39. Navy Grade                      295     895     000

[illeg.] marks an entry that is not legible in the scan.


predictor of the Navigation factor. Verbal 
facility and ability to solve verbal problems 
are prerequisite to success in all academic 
courses (including navigation) in the Pre- 
Flight curriculum. 


Technical Appendix 


A correlation matrix of the criterion variables 
was prepared (Table 1—Criterion Matrix) and 
subjected to multiple factor analysis by Thur- 
stone’s centroid method. 
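
The extraction of a first centroid factor can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's worksheet: the function name `first_centroid_factor` and the toy matrix are hypothetical, the diagonal is filled with the largest off-diagonal correlation in each column (a common communality estimate that the text does not specify), and sign reflection is omitted because the toy correlations are all positive.

```python
import math


def first_centroid_factor(r):
    """Return (loadings, residuals) for the first centroid factor.

    r is a square, symmetric correlation matrix given as a list of lists.
    Assumption: each diagonal cell is replaced by the largest absolute
    off-diagonal correlation in its column before summing.
    """
    n = len(r)
    m = [row[:] for row in r]
    for j in range(n):
        m[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    # Column sums, and the grand total of the whole matrix.
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    # Centroid loadings: column sum divided by the square root of the total.
    loadings = [s / math.sqrt(total) for s in col_sums]
    # First-factor residuals: what the factor leaves unexplained.
    residuals = [[m[i][j] - loadings[i] * loadings[j] for j in range(n)]
                 for i in range(n)]
    return loadings, residuals


# Hypothetical correlations among three examinations.
toy = [[1.0, 0.6, 0.5],
       [0.6, 1.0, 0.4],
       [0.5, 0.4, 1.0]]
loadings, residuals = first_centroid_factor(toy)
```

A further factor would be extracted by reflecting variables so that the residual column sums become positive and repeating the same arithmetic on the residual matrix.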

McNemar’s (1) criterion for the number of 
factors to be extracted (that the SD of the 
residuals should be down to or below the SD of 
the zero-order r’s) is difficult to interpret for 
this study, but it suggests that three factors, and 
possibly more, should be extracted. The diffi- 
culty of interpretation arises in determining how 


Table 4

Regression Coefficients for the Prediction of
Criterion Factors I and II

[The coefficient columns are not recoverable from the scan; only the row labels survive.]

 2. Language Skills, GED
 6. Minn. Clerical Test, Names
 7. Math. Diagnostic
 8. Math. Reasoning
10. Am. Counc. Ed.—Q Score
11. Am. Counc. Ed.—L Score
13. Aviation Class. Test
14. Mech. Class. Test





low the SD of the residuals must be before it 
may be considered to meet the criterion of being 
“down to” the value for the zero-order r’s. The 
criterion used in this study was to stop extract- 
ing factors when the residuals closely approxi- 
mated zero. 
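
The two stopping rules discussed above can be written down side by side. The sketch below takes the rule exactly as it is worded in this appendix (SD of the residuals compared with the SD of the zero-order r's) and adds the "residuals near zero" rule actually used. The function names and the tolerance value are assumptions; the text gives no numerical threshold.

```python
import statistics


def off_diagonal(matrix):
    """Collect the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def meets_sd_rule(residuals, correlations):
    """Rule as worded in the text: SD of residuals <= SD of zero-order r's."""
    return (statistics.pstdev(off_diagonal(residuals))
            <= statistics.pstdev(off_diagonal(correlations)))


def residuals_near_zero(residuals, tolerance):
    """Rule used in the study: stop when every residual is close to zero.

    The tolerance is a hypothetical choice made by the analyst.
    """
    return max(abs(v) for v in off_diagonal(residuals)) <= tolerance


# Hypothetical matrices for illustration only.
correlations = [[1.0, 0.6, 0.5], [0.6, 1.0, 0.4], [0.5, 0.4, 1.0]]
residuals = [[0.0, 0.02, -0.01], [0.02, 0.0, 0.03], [-0.01, 0.03, 0.0]]
stop_by_sd = meets_sd_rule(residuals, correlations)
stop_by_zero = residuals_near_zero(residuals, tolerance=0.05)
```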

The zero-order r’s for the predictors were sim- 
ply added to the end of the square criterion 
matrix to form a rectangular matrix (Table 1). 
The predictor columns were summed and di- 
vided by the square root of the sum of the sums 
of the criterion columns for each factor, thus 
placing the predictors in the same space as the 
criterion variables. The predictor residuals were 
computed and reflected according to the reflec- 
tions determined by the criterion variables. The 
predictor factor loadings extracted (Table 2) 
were then rotated to the identical planes used 
for the criterion factors. 
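
The extension step described in this paragraph, summing each predictor's correlations with the criterion variables and dividing by the square root of the grand total of the criterion matrix, might look as follows. The names `extend_to_predictors`, `crit_matrix`, and `pred_by_crit` are hypothetical, the numbers are invented, and reflection of signs is left out for brevity.

```python
import math


def extend_to_predictors(crit_matrix, crit_loadings, pred_by_crit):
    """Project predictors onto a centroid factor defined by the criteria.

    crit_matrix:   square criterion matrix (communalities in the diagonal)
    crit_loadings: centroid loadings of the criterion variables
    pred_by_crit:  one row per predictor, holding its correlations with
                   each criterion variable
    """
    root = math.sqrt(sum(sum(row) for row in crit_matrix))
    # Predictor loading: sum of its correlations with the criteria / root.
    pred_loadings = [sum(row) / root for row in pred_by_crit]
    # Predictor residuals, ready for extraction of the next factor.
    pred_residuals = [[r - a_p * a_c for r, a_c in zip(row, crit_loadings)]
                      for row, a_p in zip(pred_by_crit, pred_loadings)]
    return pred_loadings, pred_residuals


# Hypothetical criterion matrix with communality estimates in the diagonal.
crit_matrix = [[0.6, 0.6, 0.5],
               [0.6, 0.6, 0.4],
               [0.5, 0.4, 0.5]]
root = math.sqrt(sum(sum(row) for row in crit_matrix))
crit_loadings = [sum(col) / root for col in zip(*crit_matrix)]
# Two hypothetical predictors correlated with the three criterion variables.
pred_by_crit = [[0.3, 0.2, 0.4],
                [0.1, 0.3, 0.2]]
pred_loadings, pred_residuals = extend_to_predictors(
    crit_matrix, crit_loadings, pred_by_crit)
```

Because only the criterion matrix determines the divisor, the predictors receive loadings without influencing where the factor lies.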

The Doolittle method was used to obtain standard
partial regression coefficients to express the
relationships between the predictors and the
criterion factors. The factor loadings of each
predictor for each factor were placed in the
criterion columns (i.e., two criterion columns [one
for each factor] were used instead of the usual
one, and the predictor factor loadings were
treated as validity coefficients). The basic
equations were solved for the regression coefficients
to estimate the relationship of each predictor
variable to each factor.
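
Solving the normal equations for two criterion columns at once can be illustrated with ordinary Gaussian elimination. Doolittle's tabular layout is not reproduced here; plain elimination without pivoting is swapped in because it yields the same beta weights. The function name `solve_betas` and all numbers are hypothetical.

```python
def solve_betas(r_pred, validities):
    """Solve R * beta = v for every criterion column simultaneously.

    r_pred:     k x k matrix of predictor intercorrelations
    validities: k x c matrix, one column of validity coefficients
                (here, factor loadings) per criterion
    Returns a k x c matrix of standard partial regression coefficients.
    """
    k = len(r_pred)
    c = len(validities[0])
    # Augment the predictor matrix with all criterion columns.
    a = [list(r_pred[i]) + list(validities[i]) for i in range(k)]
    # Forward elimination is carried out once for all criterion columns.
    for p in range(k):
        for i in range(p + 1, k):
            factor = a[i][p] / a[p][p]
            for j in range(p, k + c):
                a[i][j] -= factor * a[p][j]
    # Back substitution, one criterion column at a time.
    betas = [[0.0] * c for _ in range(k)]
    for col in range(c):
        for i in range(k - 1, -1, -1):
            known = sum(a[i][j] * betas[j][col] for j in range(i + 1, k))
            betas[i][col] = (a[i][k + col] - known) / a[i][i]
    return betas


# Two hypothetical predictors correlating .5 with each other.
r_pred = [[1.0, 0.5],
          [0.5, 1.0]]
# Hypothetical loadings of each predictor on Factor I and Factor II.
validities = [[0.4, 0.3],
              [0.2, 0.5]]
betas = solve_betas(r_pred, validities)
single = solve_betas(r_pred, [[0.4], [0.2]])
```

In this toy case the second predictor receives a zero weight on the first criterion, since its validity of .2 is fully accounted for by its correlation with the first predictor. Solving for the first criterion alone (`single`) gives the same weights as the joint solution.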

The principal advantage to be derived from 
the initial factor analysis of the criterion vari- 
ables is that the obtained criterion factors may 
be isolated free from the influence of the vari- 
ance of the predictor variables. The rotated fac- 
tor loadings of the predictors indicate which ones 
are most likely to be related to the criterion fac- 
tors. Thus, computation of regression coeffi- 
cients for irrelevant predictor variables may be 
avoided. 

Extension of the Doolittle multiple correlation
table to permit the simultaneous computation of
regression coefficients for the prediction of several
criteria does not alter the results ordinarily
obtained for any single criterion. The nature of
the computations required permits the prediction
of any number of criteria once the computations
for the zero-order intercorrelations of the
predictors are completed.


Received June 5, 1953 


References 
1. McNemar, Q. On the number of factors. Psychometrika,
1942, 7, 9-18.

2. Peters, C. C. and Van Voorhis, W. R. Statistical
procedures and their mathematical bases.
New York: McGraw-Hill, 1940.

3. Edgerton, H. A. A study of individual differences
among naval aviation students. New York:
Richardson, Bellows, Henry and Co., Inc.,
1949.

4. Thurstone, L. L. Multiple factor analysis. Chicago:
University of Chicago Press, 1947.








The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Rater and Technique Contamination in Criterion Ratings 


Gloria H. Falk and A. G. Bayroff 1


Personnel Research Branch, The Adjutant General's Office, Department of the Army, 
Washington, D. C. 


An important consideration in validation 
studies is the degree to which the procedures 
for obtaining the criterion measures are in- 
dependent of the predictors. Should predic- 
tor test scores enter into the determination of 
criterion scores then the correlation between 
predictor scores and criterion scores will be 
artificially increased. Similarly, if the cri- 
terion measure is a rating, the validity will be 
inflated if the raters base their evaluations on 
prior knowledge of the predictor scores. 

When both the predictor and criterion 
measures are ratings, the problem of criterion 
contamination may be critical. It may take 
the form of rater contamination as, for ex- 
ample, when both the predictor raters and 
the criterion raters are the same persons. Or, 
criterion contamination may take the form 
of technique contamination. This form of 
contamination may exist when both the pre- 
dictor rating and the criterion rating employ 
the same rating technique. Thus, a graphic 
predictor rating may correlate more highly 
with a graphic criterion rating than would a 
forced-choice predictor rating with a graphic 
criterion rating. Since the usual criterion 
rating employs some form of graphic rating, 
the possibility exists that the graphic rating 
technique appears more valid than others pri- 
marily because both predictors and criteria 
employ the same measuring technique. 


Problem 


The subject of study described here was 
the comparative influences of two potential 


1 The opinions expressed in this article are those
of the authors and do not necessarily express the 
official views of the Department of the Army. Ac- 
knowledgment is made of the participation of vari- 
ous staff members of the Personnel Research Branch, 
particularly Edward A. Rundquist and Helen R. 
Haggerty. Acknowledgment is also made of the gen- 
erous assistance of the Commandant, Command and 
General Staff College, Fort Leavenworth, Kansas, 
and his staff, and the officer students in attendance 
during the gathering of the field data basic to the 
study. 


sources of criterion contamination: similarity
of raters and similarity of techniques. An
attempt was made to estimate the relative 
amounts of agreement between two sets of 
ratings when the two sets involved: (a) the 
same raters; and (b) the same techniques. 


Method 


Population. The population consisted of 
400 officers (primarily majors and lieutenant 
colonels) enrolled as students at the Army 
Command and General Staff College. The 
objective of this college is to train potential 
division commanders and general staff offi- 
cers, and its students represent a highly se- 
lected group. The course was 42 weeks long 
during which the students were in close con- 
tact with one another. Each officer served as 
both rater and ratee. 

Design of the Study. This study is one 
part of a larger research program on rating 
methodology. This report will be limited to 
those aspects pertinent to the present study. 
The study permitted a comparison to be 
made of the amount of agreement between 
ratings made with identical techniques and 
those made with different techniques under 
two general conditions: (a) when the same 
raters made the two sets of ratings; and (b) 
when different raters made the two sets of 
ratings. 

Instruments. The rating techniques employed
in this study were an eight-point
graphic scale of over-all value to the service
and two versions of the forced-choice technique.2
The graphic scale was provided with
descriptions for each of its eight points, ranging
from “The most outstanding officer I
know” as point 1, to “An officer who does
not have the calibre that one should reason- 


2 Schneider, Dorothy E. and Bayroff, A. G. The
relationship between rater characteristics and validity
of ratings. J. appl. Psychol., 1953, 37,
278-286.




ably expect in an officer” as point 8. To 
counteract the reluctance of most raters to 
use the low end of the scale, two of the points 
below mid-scale value were favorably defined, 
e.g., “An acceptable officer whose value is 
limited in some respects.” The fact that the 
entire scale was used may, in part, be at- 
tributed to this device. 

Both versions of the forced-choice tech- 
nique had identical items. In one form, the 
controlled check list (CCL), the items were 
arranged in two sets of 24 phrases each. The 
rater selected the 12 most descriptive phrases 
in each set. In the second form, the forced- 
choice pairs (FCP), the items were arranged 
in 24 pairs. Phrases in each pair had similar 
preference values, but different discrimination 
values. The raters selected one phrase in 
each pair. 

Procedure. The classes consisted of 33-40 
officers each. Each officer was assigned 20 
other officers of his class to rate on the 
graphic scale. These assignments were made 
according to a procedure which approximated 
random selection, and the order in which the 
ratings were to be made was specified. 

For purposes connected with other studies 
in this series and not relevant to this study, 
each class was randomly divided into two 
groups of raters. One group signed its rat- 
ings and was informed that these ratings 
would be available for official use; the other 
group did not sign its ratings and was told 
that these ratings would not be available for 
official use. Each ratee received half his rat- 
ings from one group and half from the other. 
The results in this study will be presented 
separately for the two groups as a partial 
replication device. 

Eight days after the graphic ratings were 
made, each rater re-rated two of his fellow- 
officers, first on one of the two versions of the 
forced-choice technique and then on the same 
graphic scale used a week earlier. 

Different Raters, Same Technique. Aver- 
age of the intercorrelation coefficients among 
graphic ratings from four different raters per 
ratee was determined. 
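
The "average of the intercorrelation coefficients" among four ratings per ratee can be sketched as the mean of the six pairwise product-moment correlations. The rating data below are invented, the function names are hypothetical, and a simple arithmetic mean is assumed; the text does not say whether the coefficients were averaged directly or through a transformation.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_intercorrelation(rating_sets):
    """Average the correlations over every pair of rating sets."""
    pairs = list(combinations(rating_sets, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Hypothetical graphic ratings (1 = best, 8 = poorest) of six ratees,
# one list per rater position.
rater_1 = [2, 3, 5, 4, 6, 3]
rater_2 = [3, 3, 4, 5, 6, 2]
rater_3 = [2, 4, 5, 3, 7, 4]
rater_4 = [1, 3, 6, 4, 5, 3]
average_r = mean_intercorrelation([rater_1, rater_2, rater_3, rater_4])
```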

Different Raters, Different Techniques. Av- 
erage of the correlation coefficients between 
the forced-choice ratings on the ninth day 


Table 1

Product-Moment Correlations Between Sets of Ratings

                           Rating Techniques
               Same          Different
               Graphic       Graphic vs.        Graphic vs.
               vs.           Controlled         Forced-choice
Raters         Graphic       Check List         Pairs

Same             82          [not legible]      [not legible]
                 69
Different        30          [not legible]      [not legible]
                 25


and graphic ratings from three different raters 
per ratee on the first day was computed. The 
forced-choice ratings selected for this analy- 
sis were those made by the raters whose 
graphic ratings were omitted here. 


Results and Conclusions 


As shown in Table 1, the highest correla- 
tions were obtained for the sets of ratings 
made by the same raters using the same tech- 
niques (r = .82, .69). Somewhat lower cor- 
relations were obtained for the ratings made 
by the same raters using different techniques 
(r = .57, .52). The lowest correlations were 
obtained for the sets of ratings made on the 
same ratees by different raters using the same 
techniques (r = .30, .25) and by different 
raters using different techniques (r = .29, .26, 
.24, .23). 

It was to be expected, of course, that 
evaluations of the same ratees by the same 
raters rendered 8 days apart would be in sub- 
stantial agreement. It was also to be ex- 
pected that agreement would be less when 
the evaluations were made by different raters 
on the two occasions. However, the signifi- 
cant facts to be noted are these: (a) when 
the same raters were involved, agreement was 
greater when the same technique was em- 
ployed than when different techniques were 
employed; and (b) when different raters were 
involved it made no difference whether the 
same or different techniques were employed. 

It appears, therefore, that contamination 







in this study was linked to the raters. Con- 
tamination resulting from similarity of tech- 
nique appeared only when the raters were 
identical and was virtually absent when the 
raters were different. 

In evaluating these findings, the following 
limitations of the study should be borne in 
mind: (a) the design of the study did not 
permit the use of all combinations of raters 
and technique; (b) it did not permit varying 
degrees of overlap in the two sets of raters or 
the time interval between ratings; (c) only 
a limited variety of techniques were studied; 
(d) the rater population may not have been 




typical of Army raters. Nevertheless the 
findings were internally consistent and the 
following generalization stated in terms of 
criterion contamination may be offered: in 
validating rating instruments against cri- 
terion ratings, rater contamination is more 
serious than is technique contamination. If 
the raters who provide the predictor ratings 
are different from those who provide the cri- 
terion ratings, no technique contamination 
will result. If, however, the raters are the 
same, technique contamination may appear if 
the same techniques are used.


Received April 27, 1953. 





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Validity versus Reliability 


Edward K. Strong, Jr. 


Stanford University 


Which is preferable—a test with higher va- 
lidity and lower reliability or a test with 
lower validity and higher reliability? 

Two recent investigations have raised this 
issue. Neither is sure but both incline to- 
ward giving greater weight to validity than 
reliability. Clark (1) says, “The evidence 
thus far presented makes any decision about 
types of keys to be used rather difficult, since 
improved separation of groups must be 
weighed against decreased test retest reli- 
abilities. The alternative conclusion, that 
low reliability has little meaning in a situa- 
tion where high validity is obtained, is of 
course a possibility. Had the estimate of re- 
liability been other than a test retest meas- 
ure, this would have been an attractive al- 
ternative.” 

Strong and Tucker (4) report: “BS scales 
have been selected over BO scales on the ba- 
sis of greater validity. It is believed that va- 
lidity is more important than reliability, that 
validity automatically necessitates reliability, 
and that the measures of internal consistency 
reported herein are not complete measures of 
reliability.” 

Both of these investigations have been con- 
cerned with the selection of items for a key to 
differentiate two groups on the basis of their 
interests. Clark’s aim was to select items so
that different aspects of differences in interests
would be represented by the same number
of items in each case. He does not claim
to have achieved this desideratum or to have
developed a satisfactory method of doing so,
but such was his aim, and some real
progress was achieved.

Strong and Tucker developed keys to dif- 
ferentiate medical specialists from medical 
men-in-general. They found that the origi- 
nal internist medical specialist key did not 
differentiate internists from psychiatrists to 
any marked degree. Items were then se- 
lected that would differentiate not only in- 
ternists from medical men-in-general, but 
also internists from surgeons, pathologists and 
psychiatrists. The original scales and the 
revised scales were designated, respectively,
BO and BS scales. 

A small sample from each of these investi- 
gations is given here to illustrate the prob- 
lem. Table 1 indicates that the iterative and 
Gulliksen types of key are superior to the 
original, i.e., customary types of key, as far 
as validity is concerned but have appreciably 
lower reliability. Table 2 illustrates the same 
situation from data of Strong and Tucker. 
Differences were not great for three pairs of 
scales but were in the indicated direction. 
The really serious problem concerned the BO 
and BS internist scales. Here the BO scale 
of 278 items had a reliability of .86 in com- 
parison with the BS scale of 69 items and re- 
liability of .69. But the BO scale had a va- 
lidity of 69 per cent overlap, biserial r of .47 
in comparison with the BS scale which had a 
51 per cent overlap and biserial r of .68. 
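
How much of the drop in reliability could be due to shortening alone can be gauged with the Spearman-Brown formula. That formula is not used in the article and is brought in here only as an illustration; it assumes the reported reliabilities are internal-consistency figures and that the retained items are of the same quality as the discarded ones. The function name is hypothetical.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is multiplied by length_ratio."""
    return (length_ratio * reliability
            / (1 + (length_ratio - 1) * reliability))


# BO internist scale: 278 items, reliability .86.
# BS internist scale: 69 items, observed reliability .69.
predicted_bs = spearman_brown(0.86, 69 / 278)
print(round(predicted_bs, 2))  # about 0.60
```

The observed .69 lies above the roughly .60 that shortening alone would predict, which would be expected if the retained items are individually more reliable than those dropped.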

It is to be noted that in both investigations 


Table 1

Separation of Aviation Machinists Mates from Navy Men-in-General

                             Per Cent Overlap
Type of       Number                     Cross          Test-Retest
Key           of Items     Original      Validation     Reliability

Original         83           65            58              85
Iterative        42           51            51              74
Gulliksen        49           56            51              75







Table 2

Validity and Reliability of BO and BS Scales (Strong and Tucker)

                                    Validity
               Average        Per Cent
Type of        Number         of Total       Biserial      Reliability
Scale          of Items       Overlap        r             (odd-even)

4 BO Scales      285             52            67              85
4 BS Scales      161             42            17              79


a decrease in number of items resulted in decrease
in reliability, although the retained
items had individually greater reliability and
distinctly greater validity.

The data in Table 3 have just been tabulated.
Their import raises anew the question
whether increased validity can offset decreased
reliability. The odd-even reliabilities


Table 3

Odd-Even Reliability of Scales and Test-Retest
Correlations over an Average of 18 Years

                                           Test-Retest
                          Odd-Even         Correlation
Scales                    Reliability      N = 663

Engineer                      94               79*
Life Insurance                93               75
Chemist                       91               79
Sales Manager                 90               68
Real Estate                   90          [not legible]
Doctor                        87          [not legible]
Farmer                        88          [not legible]
Lawyer                        88               73
Office                        88               65
Production Manager            85               67
Accountant                    84               65
Banker                        83          [not legible]
President                     82               50
Personnel                     82               54
Public Administrator          76               48**

* Previously reported as .76, based on 203 cases (3).

** N = 248


of 15 scales for the Vocational Interest Blank 
(2, p. 78) are given in the table and the cor- 
responding test retest correlations for 663 
former college students retested on the aver- 
age 18 years later. The eight scales with 
high reliability (.88 to .94) have an average 
test retest correlation of .73 in contrast to the 
correlation of .60 for the seven scales with 
poorer reliability. The rank-order correlation
between odd-even reliability and test-retest
correlation is .83.
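
A rank-order correlation of this kind can be computed as the product-moment correlation of the ranks, with tied values given their mean rank. The sketch below is illustrative: the function names are hypothetical, and only the eleven scales whose test-retest values are legible in this copy of Table 3 are entered, so the result need not reproduce the reported .83.

```python
import math


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_order_correlation(x, y):
    """Spearman's rho, computed as Pearson's r on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Legible rows only (decimals omitted, as in the table).
odd_even = [94, 93, 91, 90, 88, 88, 85, 84, 82, 82, 76]
test_retest = [79, 75, 79, 68, 73, 65, 67, 65, 50, 54, 48]
rho = rank_order_correlation(odd_even, test_retest)
```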

The conclusion from these data is that if 
one wishes to have test scores as a basis to 
predict behavior in the distant future he
wants tests that will give as great agreement 
as possible between scores today and scores 
in the future and that the reliability of the 
scale is important in this connection. 


Received May 19, 1953. 


References 


1. Clark, K. E. Research on scoring methods for
the U. S. Navy Vocational Interest Inventory.
Technical Report No. 5, 1952, Department of
Psychology, University of Minnesota.

2. Strong, E. K., Jr. Vocational interests of men
and women. Stanford: Stanford University
Press, 1943.

3. Strong, E. K., Jr. Nineteen-year follow-up of
engineer interests. J. appl. Psychol., 1952,
36, 65-74.

4. Strong, E. K., Jr. and Tucker, A. C. The use of
vocational interest scales in planning a medical
career. Psychol. Monogr., 1952, 66, No. 9.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Sampling Problems in Studies of Writing Style 


Richard D. Powers 


Department of Agricultural Journalism, University of Wisconsin 


The past few years have seen a growing 
number of studies in which the constituent 
elements of writing style are examined individually
and statistically. The work of
Rudolph Flesch (5) and of Edgar Dale (2) 
in “readability scoring” is perhaps best 
known, but this paper is primarily concerned 
with lesser known developments (8, 10) and 
with the future of stylistic measuring devices. 
Formulas for readability are only a secondary 
consideration. 

With most style studies has come the neces- 
sity for sampling. Application of stylistic 
analysis to the total content of a book or 
newspaper may be unnecessary, and in many
cases is physically impossible. Samples
are the alternative.

Sampling theory gives a basis for estimat- 
ing the minimum size of sample needed for 
varying degrees of precision. However, the 
most useful tests can be applied only to 
random samples; in other words, to samples 
where each unit of the kind being studied 
had an equal chance of being drawn in the 
sample. 

This brings us to a special problem of 
sampling in writing style. Usually two kinds 
of units are involved in an analysis of style: 


1. Sentence characteristics—length, form, 
structure. 

2. Word characteristics, principally the
relative difficulty of various words.


Since, in the most popular style-measuring 
devices, sampling concerns both of these units, 
almost all sampling has been done by drawing
a sample of sentences and then analyzing
these sentences and the words in them as the 
particular formula requires. If, in future 
studies concerning only word measurements, 
the assumption is made that this procedure 
gives a random sample of words, the results 
could be misleading, depending upon the type 
of stylistic measurement. 


It is apparent that a 100-word sample 
drawn by using all the words in five randomly 
selected sentences does not fulfill the condi- 
tions of random selection of 100 words. And 
Baker (1) has presented evidence that selec- 
tion of 100 consecutive words in paragraphs 
could bias results of certain kinds of studies. 
Sampling by sentences in studies of word 
characteristics is actually a type of cluster 
sampling. It involves a selection of several 
connected units of analysis at one drawing. 
As such, it is a restriction of randomness 
which could cause logical and statistical diffi- 
culties in this type of study. 

For such a clustered sample, it is an error
to apply the formula $SE = \sigma / \sqrt{N}$ for determining
the precision of the sample. Of course,
there are ways to evaluate the precision of a 
clustered sample. But to do that, we must 
have additional information, such as the
amount and direction of intercorrelations be- 
tween the elements within the clusters. Data 
on the intercorrelations between words in sen- 
tences are necessarily very meager, so we have 
no way of knowing how we affect the pre- 
cision of our samples by clustering. 
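
The role of the within-cluster intercorrelation can be made concrete with the design-effect approximation that later became standard in survey sampling: the simple-random-sampling standard error, sigma divided by the square root of N, is inflated by the square root of 1 + (m - 1) * rho, where m is the number of words per cluster. This approximation is not part of the article, it assumes clusters of equal size, and the values of sigma and rho below are purely hypothetical, since the text notes that such intercorrelations are not known.

```python
import math


def srs_standard_error(sigma, n):
    """Standard error of a mean under simple random sampling."""
    return sigma / math.sqrt(n)


def clustered_standard_error(sigma, n, cluster_size, rho):
    """Standard error inflated by the design effect 1 + (m - 1) * rho."""
    design_effect = 1 + (cluster_size - 1) * rho
    return srs_standard_error(sigma, n) * math.sqrt(design_effect)


# 100 words drawn as 5 sentences of 20 words each; rho is an assumed value.
naive = srs_standard_error(sigma=0.5, n=100)
adjusted = clustered_standard_error(sigma=0.5, n=100, cluster_size=20, rho=0.05)
```

Even a small positive intercorrelation widens the standard error noticeably when clusters are long, while a negative one would narrow it.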

Some relationships between words in a sentence
are obvious, though. For instance,
when “the” appears in a sentence, it is usually
followed by an adjective or a noun, sometimes
by an adverb, but almost never by a
verb. And, up to a point, the more words
preceding a given word in a sentence, the
more the nature of that word is predetermined.

Because of this verbal contextual effect— 
one word increasing or decreasing the prob- 
ability that the words which follow will be 
certain words or types of words—it would 
seem that some kinds of words might be over- 
represented when samples of sentences are 
drawn for studies of word characteristics. 
Likewise, it would seem that some kinds of 
words might be under-represented. 







Simple random sampling could safeguard 
against these difficulties. Theoretically, more- 
over, a study of sentence length must be done 
by drawing a random sample of sentences; a 
study of word lengths or proportions of parts 
of speech calls for a random sample of words; 
and, theoretically again, a study of clause
structure must be made on a random sample 
of clauses. But we compromise between pure 


theory and practical considerations almost 
every day, with no drastic results and often 
with considerable economy. 


Procedure 


This study, then, was an empirical attempt
to see how “cluster” sampling (sampling by
sentences) affected certain arbitrarily selected
word variables: (a) representation of
different parts of speech in the sample; (b)
proportion of “hard” words (defined as words
not in Edgar Dale’s list of 3,000 words); (c)
proportions of words of different syllable
lengths (words of one or two syllables were
called “short” words for the purposes of this
study); and (d) proportions of “structural”
words (defined as prepositions, conjunctions,
and articles; an exclusive, though not
inclusive, category).

The two samples were drawn by equally 
random methods. The sample of 1,000 words 
was picked one word at a time by a table of 
random numbers. The sample of 64 sen- 
tences was also picked by a table of random 
numbers (997 words). 

For the word sample, the first four num- 
bers of the random number table designated 
the page number, the next number designated 
the paragraph number on that page or the 
following one, and the next two numbers 
indicated the word in the selected paragraph 
(or the following one) which was to be drawn 
into the sample. The sentence sample was 
selected using the first four random numbers 
for the page number and the next two numbers
for the sentence number on that page or
the following one. The samples were drawn 
from a three-volume report the U. S. Depart- 
ment of Agriculture prepared for a congres- 
sional committee (12). 
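The contrast between the two drawing procedures can be sketched in code. The Python sketch below is illustrative only. It assumes the text is already held as a list of sentences, each a list of words, and it draws directly from those lists rather than by the page, paragraph, and word numbers described above. The function names and the toy text are hypothetical and are not taken from the study.

```python
import random


def simple_random_word_sample(sentences, n_words, rng):
    """Draw n_words single words; every word in the text is its own sampling unit."""
    all_words = [word for sentence in sentences for word in sentence]
    return rng.sample(all_words, n_words)  # sampling without replacement


def sentence_cluster_sample(sentences, n_sentences, rng):
    """Draw n_sentences whole sentences and keep all of their words (a cluster sample)."""
    chosen = rng.sample(sentences, n_sentences)
    return [word for sentence in chosen for word in sentence]


# Hypothetical toy text, not material from the report sampled in the study.
toy_text = [
    ["the", "committee", "reviewed", "the", "report"],
    ["prices", "rose"],
    ["farmers", "in", "the", "region", "planted", "more", "wheat"],
    ["research", "continued"],
]

rng = random.Random(1954)  # fixed seed so the sketch is repeatable
word_sample = simple_random_word_sample(toy_text, 6, rng)
cluster_sample = sentence_cluster_sample(toy_text, 2, rng)
print(len(word_sample), "words drawn singly")
print(len(cluster_sample), "words drawn in sentence clusters")
```

In the cluster sample the number of words is not fixed in advance; it depends on which sentences happen to be drawn, which is consistent with the 64 sentences of the study yielding 997 words rather than exactly 1,000.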

This particular report was used for sam- 
pling because the first aim of the study had 




been to establish grammatical and vocabulary 
differences between a “popularized” version 
and a rather technical version of the same 
material. Both kinds of writing were con- 
tained in the report. The difficulty in deter- 
mining sample size led to the present study, 
the original intent having been dropped. 

A sample of 1,000 words was the size se- 
lected because, on assumptions of simple ran- 
dom sampling, this size assures accuracy in 
word studies to within three per cent of the 
sample value 95 per cent of the time when 
allowing for maximum variability. Most of 
the time, the precision of such a sample 
would approach two per cent for measure- 
ments of parts of speech. A rough analysis 
of the accumulated percentages by 50-word
subsamples shows that, in general, the sen- 
tence sample was more stable than the word 
sample. That is, the curve levelled off at an 
earlier point than for the word sample. 
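The three per cent figure can be checked with a short calculation. The sketch below assumes the usual standard error of a proportion under simple random sampling, sqrt(p(1 - p)/N), and a 95 per cent multiplier of 1.96; the text does not state the formula used, so both are assumptions. "Maximum variability" is taken to mean p = .50, and the value p = .14 in the second call is only an illustrative proportion for a single part of speech.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the interval around a sample proportion p from n units.

    Assumes simple random sampling; z = 1.96 corresponds to 95 per cent.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Maximum variability (p = .50) with 1,000 words: about .031, i.e. three per cent.
worst_case = margin_of_error(0.50, 1000)

# An illustrative smaller proportion (p = .14): about .022, i.e. near two per cent.
typical_case = margin_of_error(0.14, 1000)

print(round(worst_case, 3), round(typical_case, 3))
```

The calculation holds only for a simple random sample of words; for the sentence sample the same formula would be the one the text warns against applying.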

The significance of the difference between 
the two samples for each measure was estab- 
lished by a “t” score—the difference be- 
tween the two, divided by the standard error 
of the difference. 
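A score of this kind can be sketched as follows. The sketch assumes the measure is a proportion and uses the unpooled standard error of the difference between two independent proportions; the text does not say which form of the standard error was used, so that choice is an assumption. The proportions in the example call are hypothetical, while the sample sizes of 1,000 and 997 words are those reported above.

```python
import math


def difference_score(p1, n1, p2, n2):
    """Difference between two sample proportions divided by its standard error.

    The standard error is the unpooled form sqrt(p1*q1/n1 + p2*q2/n2),
    an assumption made for this sketch.
    """
    se_diff = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return (p1 - p2) / se_diff


# Hypothetical proportions for some word category in the two samples.
score = difference_score(0.50, 1000, 0.45, 997)
print(round(score, 2), abs(score) >= 1.96)
```

A score whose absolute value reaches 1.96 would be called significant under the criterion given in the table footnotes.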


Results 


Table 1 shows that, for measurement of the
proportion of different parts of speech, drawing
the sample by sentences did not significantly
affect the results. (A word was classified
as a particular part of speech on the
basis of traditional grammatical rules, with
the exception that nouns modifying other
nouns were classified as adjectives.)

However, as shown in Table 2, the results
were significantly different for the measurements
of proportions of “short” words, “easy”
words, and “structural” words. These categories
are more rigidly defined and, logical
reasoning would tell us, more meaningful
in measuring such style aspects as the reading
ease of writing. No significant differences
were obtained when the proportions of nouns,
adjectives, and verbs were lumped together
in a manner similar to that of forming the
“structural words” category. Since the structural
words are generally short words that
are included in the Dale 3,000 word list,







Table 1 


Proportions of the Various Parts of Speech in a Sample Drawn by Selecting all Words 
from Randomly-drawn Sentences 


Sample Drawn by Words    Sample Drawn by Sentences    Difference

%  N  %  N
Nouns 31 p 31 
Adjectives 20 202 17 
Verbs 14 i4 
Adverbs 1 11 1 


Prepositions 18 
Articles 
Conjunctions 


08 
47 


Pronouns 36 
* Score of 1.96 statistically significant. 


these latter three factors are probably inter- 
related. 

Two hypotheses suggested by this part of 
the study are that the measure of average 
syllable length per word is a less sensitive 
gauge of word difficulty than the measure of 
proportions of words of different syllable 
lengths, or that it may require huge samples 
to reveal smaller shades of difficulty. This 
might suggest that Gunning’s Fog Index (7) 
is a readability measure based at least in part 
on sounder premises than Flesch’s readability 
formula, or that the Flesch formula may re- 
quire larger samples than we have thought. 

It might also merit study in light of the
recent controversy between Farr, Jenkins, and
Paterson (3, 4) vs. Klare (9) and Flesch (6)
as to whether or not a count of one-syllable
words would suffice to indicate vocabulary
difficulty. Though the actual behavior and
evaluation of these two measurements are a
subject for further detailed study, Table 3
shows the proportions of words of various
syllable lengths obtained in this study.


Discussion 


In applying the findings of this study to
the field of style analysis, one can safely supply
only a caution: that as measurements of
style variables become more refined, the sampling
methods may also have to become more
refined. In other words, as style analysis
goes from rather crude and vaguely defined
measurements, such as the proportions of parts of
speech proposed by Stormzand and O’Shea
(11), to the “psychogrammatical” categories
proposed by Sanford (10) or to other refined
ratios and categories (8), sampling by the


Table 2 


Proportions of “Short” Words, “Hard” Words, and “Structural” Words in a Sample Drawn Randomly by
Words and a Sample Drawn by Selecting All Words in Randomly-drawn Sentences


Sample Drawn 
by Words 


N 


Short Words 
Hard Words 
Structural Words 
Syll. per Word 


1.89 


* Score of 1.96 statistically significant. 


Sample Drawn 
by Sentences 


Difference


N % N 


Oo 


708 4 









Table 3 


Proportions of Words of Various Syllable Lengths in a Sample Drawn Randomly by Words and a
Sample Drawn by Selecting All Words from Randomly-drawn Sentences


“Short” Words
1 syll.  2 syll.

N  %  N  %


Word Sample 509 2 
Sentence Sample 5455: 223 2 


traditional method of drawing words in sentences
and in paragraphs should be examined
carefully to see that the clustering of units
does not adversely affect the results to such
a degree that the efficiency of sampling becomes
a false economy.


Summary 


As our techniques of studying language be- 
come more refined, we need to take a closer 
look at our sampling methods. 

The samples usually drawn for language 
studies are made up of clusters of words, in 
sentences or paragraphs. In some studies, 
the words so collected have been subjected to 
further analysis. 

Such a sample is a “clustered” sample, and 
a network of unknown intercorrelations be- 
tween the words interferes with the known 
probability that any unit of the universe will 
be drawn in the sample. Thus, for some 
word characteristics at least, sampling by sen- 
tences would bias the sample of words in an 
undetermined direction. 

Random sampling of words is suggested as 
a way to sidestep such difficulties in word 
studies. A comparison of the two sampling 
methods (clustered and simple random) indi- 
cates that a clustered sample significantly 
overestimated the percentage of “short” 
words, “structural” words, and “easy” words. 
It is suggested that the structure of the sen- 
tence (the need to have many short and easy 





“Long” Words
4 syll.  5 syll.

N  %  N  %

10


connective words) has imposed an orderli- 
ness that has biased the clustered sample. 


Received May 29, 1953. 


References 


1. Baker, S. J. A linguistic law of constancy. II.
J. gen. Psychol., 1951, 44, 113-120.

2. Dale, E. and Chall, J. S. A formula for predicting
readability. Ed. Res. Bull. Ohio St.
Univ., 1948, 27, 37-54.

3. Farr, J. N., Jenkins, J. J. and Paterson, D. G.
Simplification of the Flesch reading ease
formula. J. appl. Psychol., 1951, 35, 333-337.

4. Farr, J. N., Jenkins, J. J., Paterson, D. G. and
England, G. W. Reply to Klare and Flesch
on “Simplification of reading ease formula.”
J. appl. Psychol., 1952, 36, 55-57.

5. Flesch, R. A new readability yardstick. J. appl.
Psychol., 1948, 32, 221-223.

6. Flesch, R. Reply to “Simplification of Flesch
reading ease formula.” J. appl. Psychol.,
1952, 36, 54-55.

7. Gunning, R. D. The technique of clear writing.
New York: McGraw-Hill Co., 1952.

8. Johnson, W. People in quandaries. New York:
Harper, 1946.

9. Klare, G. R. A note on “Simplification of the
Flesch reading ease formula.” J. appl. Psychol.,
1952, 36, 53.

10. Sanford, F. H. Speech and personality.
Psychol. Bull., 1942, 39, 811-845.

11. Stormzand, M. J. and O’Shea, M. How much
English grammar? Baltimore: Warwick and
York, 1924.

12. U. S. Government Printing Office. Report of Research
and Related Activities of the U. S. Department
of Agriculture, Washington, D. C.,
1951.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Differential Prediction of Academic Success at Brigham Young
University ¹


Joics B. Stone 


Brigham Young University 


The research on prognosis of college aca- 
demic success (1, 2, 3, 4) has focused on 
three phases of the problem: (a) prediction 
of general scholarship; (b) prediction of 
scholarship in specific subjects or subject 
groups; and (c) differential prediction in ma- 
jor areas or curricula. The most effective 
predictor variables have proved to be high 
school grade-point average, some measure of 
scholastic aptitude, and an objective measure 
of high school achievement. Multiple cor- 
relations have proved more efficient, gener- 
ally, than zero-order correlations. 

The present study represents an attempt to 
provide multiple regression equations which 
can be used in the differential prediction of 
academic success in four college curricula at 
Brigham Young University: (a) commerce; 
(b) elementary education; (c) physical sci- 
ences; and (d) social sciences. 


Plan of the Study 


Curriculum Components. The four cur- 
ricula studied included the following aca- 
demic departments: 


1. Commerce: accounting, business administra- 
tion, finance and banking, and marketing. 

2. Elementary education. 

3. Physical sciences: chemistry, geology, mathe- 
matics, and physics. 

4. Social sciences: history, political science, and 
sociology. 

Criterion. The criterion was selected to con- 
form to those curricula. The curriculum grade- 
point average (CGPA) was selected as the meas- 
ure of the criterion. Only courses essential to 
each curriculum were used in computing the 
CGPA. A minimum of thirty curriculum credit- 
hours was required for each student. This mini- 
mum constituted from one-half to two-thirds of 
the departmental major requirement for gradua- 
tion. The reliability of the criterion was checked 


¹ This paper is a portion of a dissertation presented
as partial fulfillment of the requirements of
the degree of doctor of philosophy at the University 
of Utah. The writer is particularly indebted to Dr. 
F. B. Jex and Dr. R. D. Willey, who served vari- 
ously as chairman of the dissertation committee. 


by correlating the CGPA of the first 10 hours of 
curriculum credit against the total CGPA. For 
the respective criteria the reliability coefficients 
were: commerce, .78; elementary education, .82; 
physical sciences, .79; and social sciences, .68. 

Students. The commerce curriculum included
102 students; 123 were in elementary education;
133 in the physical sciences; and 78 in the social
sciences. Except for the elementary education
group, there was a predominance of male students
in each group.

Predictor Variables. The total high school 
grade-point average (HSGPA) and two tests 
were used. The tests were part of the entrance 
battery of this university. The 1949 editions of 
the American Council on Education Psychologi- 
cal Examination (ACE) and the Cooperative 
General Culture Test (CGCT) were used. Sub- 
test scores and total scores were tabulated. The 
Wherry-Doolittle method of test battery selec- 
tion was used. 


Results 


The most efficient single predictor of cur- 
riculum success was the HSGPA. In com- 
bination with the ACE Total score, it sup- 
plied the most efficient batteries, with an ad- 
ditional factor in the social science curriculum 
and two in the physical sciences. 

The multiple correlations for the most efficient
battery in each curriculum are shown
in Table 1. Also shown are the respective
Index of Forecasting Efficiency (E), the Coefficient
of Determination (R²), and the
Standard Error of R.
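How these three quantities follow from a multiple correlation can be sketched briefly. The Python sketch below uses the conventional textbook definitions: R squared for the coefficient of determination, 1 - sqrt(1 - R²) for the index of forecasting efficiency, and (1 - R²)/sqrt(N - 1) for the standard error of R. The paper does not state its computing formulas, so these are assumptions. The example uses the commerce values reported in the paper (R = .633, N = 102).

```python
import math


def coefficient_of_determination(r):
    """Proportion of criterion variance accounted for by the battery."""
    return r * r


def forecasting_efficiency(r):
    """Index of forecasting efficiency, E = 1 - sqrt(1 - R^2) (assumed form)."""
    return 1.0 - math.sqrt(1.0 - r * r)


def standard_error_of_r(r, n):
    """Large-sample standard error of R, (1 - R^2) / sqrt(N - 1) (assumed form)."""
    return (1.0 - r * r) / math.sqrt(n - 1)


# Commerce curriculum as reported in the paper: R = .633 with 102 students.
r_commerce, n_commerce = 0.633, 102
print(round(coefficient_of_determination(r_commerce), 3))     # about .401
print(round(forecasting_efficiency(r_commerce), 3))           # about .226
print(round(standard_error_of_r(r_commerce, n_commerce), 3))  # about .060
```

Under these assumed formulas the commerce figures come out at about 40.1 per cent of criterion variance and a standard error near .060, which appears to agree with the values reported for that battery.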

The most efficient battery for predicting 
academic success in the commerce curriculum 
included the HSGPA and the ACE Total 
score. This battery accounted for 40.1 per 
cent of the criterion variance, compared to 
35 per cent for the best single predictor 
(HSGPA). 

These two factors, HSGPA and ACE Total, 
also comprised the most efficient battery for 
predicting success in the elementary education
curriculum. This battery accounted for
53.4 per cent of the criterion variance, com- 
pared to 45 per cent for HSGPA, alone. 









Table 1 


Multiple Correlations of Certain Predictor Variables and Success in Four Curricula at 
Brigham Young University 


Curriculum             Battery                               R      SEr

Commerce               HSGPA & A.C.E. Total                  .633   .060
Elementary Education   HSGPA & A.C.E. Total                  .731   .042
Physical Science       HSGPA, A.C.E. Total,                  .733   .040
                       CGCT Literature & General Science
Social Science         HSGPA, A.C.E. Total,                  .507
                       CGCT General Science


The factors, HSGPA and ACE Total, were 
supplemented by the CGCT Literature and 
General Science sub-tests in providing the 
most efficient battery for predicting success 
in the physical sciences. This battery ac- 
counted for 53.7 per cent of the criterion 
variance, compared to 33 per cent for 
HSGPA, alone. 

The factor, CGCT Literature, dropped out 
of the above battery in providing the most 
efficient battery for predicting success in the 
social sciences, leaving the HSGPA, ACE 
Total, and CGCT General Science. This 
battery accounted for 26 per cent of the cri- 
terion variance, compared to 18 per cent for 
the ACE Linguistic score. It should be noted 
that the criterion reliability for this curriculum
was substantially lower than that of
the other curricula. 

Multiple regression equations and conver- 
sion tables were prepared for each of the 
above batteries. It is possible for a counselor 
at Brigham Young University to take the 
student’s HSGPA, ACE Total score, and 
CGCT Literature and General Science scores, 
and determine the predicted grade-point av- 
erage (PGPA) for that student in any one 
or all of the four curricula studied. 


Summary 
1. The utilization of entrance test data 
and high school grade-point average pro- 
vides the counselor at Brigham Young Uni- 


versity with the basis for making differential 
predictions of academic success in four cur- 
ricula. 

2. For commerce and elementary educa- 
tion, the most efficient battery included the 
HSGPA and ACE Total scores. The respec- 
tive R’s were .633 and .731. 

3. The physical sciences criterion was best 
predicted by a battery including the HSGPA, 
ACE Total, and CGCT Literature and Gen- 
eral Science. R for this battery was .733. 

4. The social science predictor battery included
the HSGPA, ACE Total and CGCT
General Science. R was .507.

5. The best single predictor was the
HSGPA.

6. The reliability coefficients of the criterion
measure (CGPA) clustered around .80
except for the social science curriculum with
an r of .68.


Received June 2, 1953. 


References 


1. Crawford, A. B. and Burnham, P. S. Forecasting
college achievement. New Haven: Yale University
Press, 1947.

2. Monroe, W. S. (editor). Encyclopedia of educational
research. New York: The Macmillan
Company, 1950. Pp. 882-886.

3. Wallace, W. C. Differential predictive value of
the A.C.E. Psychological Examination. Sch.
& Soc., 1949, 70, 23-25.

4. Wolf, R. R., Jr. Differential forecasts of achievement
and their use in educational counseling.
Psychol. Monogr., 1939, 51, 1-53.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Performance of College Students on a Mechanical Knowledge 
Test 


Benjamin Balinsky and Charles Hujsa


City College of New York 


When given the SRA Mechanical Aptitude 
Test as part of a course in Vocational Psy- 
chology, students commented that they did 
not do as well on the Mechanical Knowledge 
subtest as on the Space Relations and Shop 
Arithmetic. In order to test the comment, 
the SRA Mechanical Aptitude Test results of 
112 male students were tabulated. All stu- 
dents were in either the junior or senior year 
of the School of Business of the City Col- 


lege of New York and between the ages of 
19 and 22. 

The Revised Minnesota Paper Form Board 
Test was available on the 112 students, and 
the Location, Blocks, and Pursuit subtest 
scores of the MacQuarrie Test for Mechani- 
cal Ability on 50 of the 112 students. These 
tests were also included in the study. 

Test intercorrelations were calculated for 
all combinations of tests. Tests of signifi- 


Table 1 


Test Intercorrelations for College Students †


Mech. 
Knowl. 


Space 
Rel. 


Shop 
Arith. 


Mech. Knowl. 09 07 
Space Rel. 18* 
Shop Arith. 

Total SRA 

Rev. Minn. P.F.B 

Location 

Blocks 

Pursuit 


Rev. 
Minn. 
P.F.B. 


.19* 
a" 
jd 
ae" 


r (Total SRA Minus Mech. Knowl. X Mech. Knowl.) = .12 
r (Total SRA Minus Space Rel. X Space Rel.) = .18* 
r (Total SRA Minus Shop Arith. X Shop Arith.) = .15 


† The correlations with the Location, Blocks and Pursuit subtests are based on 50 subjects; others on 112.
* Significant at the 5% level. 
** Significant at the 1% level. 


Table 2 


Means, Standard Deviations and Tests of Significance for College and SRA Male Trainee Groups


                   College Group      SRA Male Trainees
                   Mean     S.D.      Mean     S.D.

Mech. Knowl.       25.0     6.3       31.8
Space Relations    20.4     4.1       19.0
Shop Arith.        14.4     3.7        9.8
Total              59.6               60.5







cance were computed for the difference be- 
tween the means of the test scores of the col- 
lege students and the norm group of male 
trainees in the SRA Mechanical Aptitude 
Test. These data are presented in Tables 1 
and 2. 

Incidentally, the mean Mechanical Knowl- 
edge score of the students is at the 19th per- 
centile of the male trainee norms, the mean 
Space Relations at the 55th percentile, and 




the mean Shop Arithmetic at the 87th per- 
centile. The mean score of the college stu- 
dents on the Revised Minnesota Paper Form 
Board is at the 70th percentile of the 
machine and electrical apprentice applicants 
norms and the difference between the means 
of both groups is significant at < .001 in 
favor of the students. 


Received June 19, 1953. 





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Relation of Scholastic Aptitude to Socioeconomic Status and to 
a Rural-to-Urban Continuum 


Norman F. Washburne and Dean C. Andrew 


Southern State College, Magnolia, Arkansas 


Scholastic aptitude tests play an impor- 
tant role in modern education. They are used 
in schools for the purpose of aiding students 
in selecting courses and vocations, and have 
a variety of other uses in guidance, counsel- 
ing, and various aspects of research. At 
Southern State College it is the practice to 
administer the college level ACE Psychologi- 
cal Examination to all entering freshmen. 
Where such a practice prevails, it is desirable 
to know if the test measures what it is in- 
tended to measure, and what factors, if any, 
might bias the results of the test. 

The student body of Southern State Col- 
lege is composed primarily of residents of 
southern Arkansas, northern Louisiana, and 
eastern Texas. It is a regionally homogene- 
ous population. There are no significant im- 


migrant groups, and the undergraduate stu- 
dents are all Caucasians. However, the popu- 


lation is quite varied in two respects: (1) the 
socioeconomic status of the individual stu- 
dent; and (2) the sizes of the communities 
in which the students have grown up. The 
question therefore arises, do these variations 
in socioeconomic status and degree of urbani- 
zation affect the scholastic aptitude of the 
students as measured by the ACE Psycho- 
logical Examination? 

In order to test this question, a sample of 
100 students was drawn at random from those 
who had been enrolled in April 1952.¹ These
students had been given the ACE and had 
also filled out a questionnaire which made it 
possible to determine their socioeconomic 
statuses and the relative degrees of urbani- 
zation of their residence histories. Coeffi- 
cients of total and partial correlation were 
computed in order to determine the relation- 
ships among the three variables. 

The Scholastic Aptitude Test. The ACE 


¹ This sample size approximates one-sixth of the
undergraduate student body at the time. 


Psychological Examination² is an instrument
designed to measure the scholastic aptitude of 
American college freshmen. Its scoring yields 
three measures: the Q score; the L score; and 
the total score. The Q score is a measure of 
the respondent’s ability to solve problems of 
quantitative nature. The range of the Q 
scores of our sample was from 9 to 60. The 
L score is the measure of the respondent’s 
ability to solve problems of a linguistic na- 
ture. The range of the L scores of our sam- 
ple was from 27 to 92. The total score is a 
sum of the Q and L scores and is a measure 
of the total scholastic aptitude of the respond- 
ent. The range of the total scores of our 
sample was from 36 to 142. 

The Socioeconomic Status Scale. The so- 
cioeconomic status scale is an instrument de- 
signed to quantify the social and economic 
position of the college student.³ Its scores
are based upon the occupation of the stu- 
dent’s father, and upon the educational at- 
tainments of both of the student’s parents. 
The occupational and educational factors are 
weighted equally. The scale differs from 
some other socioeconomic status scales used 
in similar studies in two ways: (1) its oc- 
cupational factor is not arbitrarily scored, 
but rather is based upon the students’ own 
evaluation of occupations representative of 
those of their fathers; and (2) it does not 


assume that social classes exist as discrete 


² American Council on Education Psychological
Examination, 1948 College Edition, Educational
Testing Service, Cooperative Test Division
(Princeton, New Jersey), 1951.

³ Details of the construction and validation of the
Socioeconomic Status Scale and the Residence History
Scale are to be found in Norman F. Washburne,
“Urban” Attitudes and Responses as Related
to Residence in Urban Communities and to Socioeconomic
Status, Ph.D. Dissertation, Washington
University, St. Louis, Missouri, 1953. This work
has also been published in mimeographed form as an
Institutional Study of Southern State College,
Magnolia, Arkansas, and a limited number of copies are
available on request.









Table 1 


Total and Partial Correlation of Scholastic Aptitude Scores with Socioeconomic Status Scores,
100 Southern State College Students 





               Coefficient of        Probability of
               Total Correlation     Null Hypothesis

Q score             .024                P > .05
L score             .123                P > .05
Total score         .166                P > .05


r of Socioeconomic Status vs. Residence History = + .19. 


cultural units, but rather assumes a con- 
tinuum of socioeconomic statuses. The points 
of the scale are handled statistically as if 
they were the midpoints of intervals along 
the continuum. The theoretical as well as 
the actual range of the socioeconomic status 
scale is from 2 to 10. 

To clarify the meaning of the scale the fol- 
lowing examples are offered: The father of 
one student scoring 10 on the socioeconomic 
status scale is an owner of a large manufac- 
turing plant. Both of the student’s parents 
had gone to college and one of them had 
taken graduate work beyond the baccalaure- 
ate degree. On the other end of the scale the 
father of one student who scored 2 is a day 
laborer on a farm. Neither of this student’s 
parents had completed the sixth grade in 
school. 

The occupational factor of the scale cor- 
related highly with North and Hatt’s similar 
scheme⁴ and so is judged to be valid. Socioeconomic
status scores were computed for a
sample of 100 students from data gathered 
on two different occasions, and the scale was 
found to be reliable. 

Residence History Scale. The residence 
history scale is an instrument designed to 
quantify the degree of urbanization of the 
backgrounds of individuals. It is, as far as 
we know, the first instrument which goes be- 
yond the simple characterization of students’ 
home-towns as being either rural or urban. 
It is a complicated device which takes into 
account the size, degree of isolation, and 

⁴ Cecil C. North and Paul K. Hatt. Jobs and occupations:
a popular evaluation. Opinion News,
September 1, 1947, pp. 3-13.


               Partial Correlation with Residence     Probability of
               History Held Constant                  Null Hypothesis

Q score             .024                                 P > .05
L score             .070                                 P > .05
Total score         .115                                 P > .05

proximity of larger urban centers of all the 
individual’s places of residence from the time 
he entered the first grade until the present. 
It also takes into account the length of time 
the individual spent in each place of resi- 
dence. The residence history scale assumes 
a rural-to-urban continuum. It has a theo- 
retical range from 0 to 50. A score of 0 
would indicate that the individual has lived 
all his life more than 100 miles away from 
the nearest community of 250 population. 
At the other extreme, a score of 50 would 
indicate that the student has lived all his life 
within 6 miles of a city of a population of at 
least one-half million. The actual range of 
the residence history scores of our sample 
was from 10 to 48. Residence history scores 
were computed for 100 students from data 
gathered on two different occasions, and the 
scale was found to be reliable. 


Results 


The relationships between the scores on the 
scholastic aptitude test and the socioeconomic 
status scores of the sample are presented in 
Table 1. 

It can be seen from Table 1 that all co- 
efficients of total correlation were low and 
statistically not significant. However, since 
the coefficient of total correlation between 
residence history scores and socioeconomic 
status scores was found to be + .19, it seemed 
feasible to seek an understanding of the ef- 
fects of each of the factors upon the scholastic 
aptitude scores when the other was held con- 
stant. Table 1 therefore also presents the co- 
efficients of partial correlation of the test 







Table 2 


Total and Partial Correlation of Scholastic Aptitude Scores with Residence History Scores, 


100 Southern State College Students 








               Coefficient of        Probability of
               Total Correlation     Null Hypothesis

Q score             .245                P < .01
L score             .302                P < .01
Total score         .308                P < .01


scores with socioeconomic status scores while 
the residence history scores are held constant. 
None of the resulting relationships are shown 
to be significant. 

The relationships between residence history 
scores and the scholastic aptitude scores are 
presented in Table 2. 

All coefficients of total correlation between 
residence history scores and the scholastic 
aptitude scores are shown in Table 2 to be 
significant at the one per cent level. That
is, correlations of this size would arise
by chance less than one time in a
hundred. When coefficients of
partial correlation of the scholastic aptitude 
scores with the residence history scores were 
calculated while socioeconomic status scores 
were held constant, all of the resulting coeffi- 
cients were slightly lower than the total co- 
efficients with the exception of the Q score 
which remained the same. However, even 
these slightly lower coefficients of partial cor- 
relation were found to be significant at the 
one per cent level. All of the relationships 
were in the direction of rural-to-urban, i.e., 
the more urban the background the greater 
the scholastic aptitude. 
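The step from total to partial coefficients can be illustrated with the standard first-order partial correlation formula. The paper does not state its computing formula, so its use here is an assumption. The Python sketch below takes the L score values reported in Tables 1 and 2 (total correlations of .302 with residence history and .123 with socioeconomic status) together with the reported correlation of +.19 between the two scales; the function name is hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x with y when z is held constant (first-order partial r)."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# L score (x), residence history (y), socioeconomic status (z).
r_l_residence = 0.302
r_l_status = 0.123
r_status_residence = 0.19

# L score with residence history, socioeconomic status held constant.
print(round(partial_correlation(r_l_residence, r_l_status, r_status_residence), 3))

# L score with socioeconomic status, residence history held constant.
print(round(partial_correlation(r_l_status, r_l_residence, r_status_residence), 3))
```

For the L score the two results round to .286 and .070, the partial coefficients shown for that score in the tables.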


               Partial Correlation with Socioeconomic   Probability of
               Status Held Constant                     Null Hypothesis

Q score             .245                                   P < .01
L score             .286                                   P < .01
Total score         .295                                   P < .01


Summary and Conclusions 


This investigation attempted to discover 
the relationships between scholastic aptitude, 
socioeconomic status, and placement of the 
individual upon a rural-to-urban continuum, 
as these variables applied to Southern State 
College students. The results seem to justify 
the following conclusions: 

1. For this group of college students there 
is no significant relationship between socio- 
economic status and scholastic aptitude as 
measured by the ACE Psychological Exami- 
nation. 

2. There is a significant, though low, correlation
between placement of the students’
residence history upon a rural-to-urban con- 
tinuum, and their scholastic aptitude as meas- 
ured by the ACE Psychological Examination. 
That is to say that students from more urban 
backgrounds tend to receive higher scores 
than do students from rural backgrounds. 

Because these findings apply only to South- 
ern State College students, it is suggested 
that further research be conducted on students
in other schools and in other regions
to see if the findings are confirmed.


Received May 14, 1953. 





The Journal of Applied Psychology
Vol. 38, No. 2, 1954


Further Results on Group Manual Dexterity in Men 


Andrew L. Comrey and Gerald Deskin 


The University of California at Los Angeles 


In a previous experiment,¹ 65 pairs of volunteer
male university students were given
six individual trials on the Purdue Pegboard, 
Assembly Task, and six trials on the Assem- 
bly Task with the two members of each pair 
working together on the same assemblies 
rather than individually on separate boards. 
The members of each pair were divided on 
the basis of the total of the last four indi- 
vidual trials, Assembly Task, into “high” 
and “low” categories. Reliabilities were determined
for “high,” “low,” and “group”
performances, using alternate trials and cor- 
recting for doubled length. Correlations of 
the “high” and “low” performances with the 
“group” performance and with each other 
were computed and corrected for attenua- 
tion. The multiple correlation and regression 
weights were obtained for predicting “group” 
performance from “high” and “low” indi- 
vidual performances. The results showed 
that less than half the group performance 
variance could be predicted from a knowl- 
edge of the individual performances, even 
with the effects of errors removed. The level 
of group performance was only slightly more 
dependent on the “low” individual perform- 
ances. For all practical purposes, equal 
weights could have been used for “high” and 
“low” scores in predicting “group” perform- 
ance. 
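The two corrections named above (stepping an alternate-trials reliability up to doubled length, and correcting a correlation for attenuation) can be sketched as follows. This is a minimal illustration assuming the standard Spearman-Brown and Spearman attenuation formulas; the function names and the numeric inputs are hypothetical and are not values from the experiment.

```python
import math


def spearman_brown_doubled(r_half: float) -> float:
    """Step a half-length (alternate-trials) reliability up to full length."""
    return 2.0 * r_half / (1.0 + r_half)


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between two scores with error removed from both."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical inputs, chosen only to show the arithmetic.
half_length_r = 0.80
full_length_r = spearman_brown_doubled(half_length_r)
corrected_r = correct_for_attenuation(0.45, 0.90, 0.90)

print(f"stepped-up reliability: {full_length_r:.3f}")
print(f"corrected correlation:  {corrected_r:.3f}")
```

Because the corrected coefficient divides by the square root of the product of the two reliabilities, any underestimate of a reliability inflates the corrected correlation.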

The present experiment was designed to 
provide a check on the first experiment and 
to determine the effect of an alteration in the 
nature of the individual task on the amount 
of group variance which could be predicted. 
One of the hypotheses offered to account for 
the fact that much of the variance in the 
group performance scores could not be pre- 
dicted from a knowledge of the individual 
performances was that the two tasks might 
have been too unlike each other. Although 


¹ Comrey, A. L. Group performance in a manual
dexterity task. J. appl. Psychol., 1953, 37, 207-210.




the same end product resulted in both indi- 
vidual and group performance, the latter re- 
quired the subjects to alternate the operations 
they performed on successive assemblies. 
The first subject, for example, would place 
a peg in the first hole on his side of the 
board, after which the second subject would 
add a washer and the first subject would 
follow with a collar, and finally the second 
subject would complete the assembly with 
another washer. Instead of repeating this 
operation, however, the subject who finished 
the assembly would begin the next assembly 
by placing a peg in the second hole on his 
side. The first subject would then place on 
the first washer, and so on. 

Since in the individual task, the subject 
performed each assembly just like the previ- 
ous one, he was not confronted with the ad- 
ditional task of altering his set for each sub- 
sequent assembly, as he was required to do 
for the group performance task. It was hy- 
pothesized that this requirement for chang- 
ing set might have introduced additional 
abilities into the task which were not present 
in the individual task, thereby lowering the 
validity of the individual performance scores 
for predicting the group performance scores. 

To test this hypothesis, the experiment was 
repeated using a redesigned individual task 
which required the subjects to make a change 
of set on each assembly like that to be re- 
quired later in the group task. Instead of 
using the standard instructions for the Purdue 
Pegboard, Assembly Task, the subjects were 
instructed to begin each assembly after the 
first one with the same hand used to place 
the final washer on the preceding assembly. 
In this way, the subject was required to make 
alternate assemblies with reversed hand op- 
erations, substituting the left hand for those 
operations previously performed with the 
right hand, and vice versa. 








Results and Discussion

In every way except those differences already mentioned, the
experimental procedure and treatment of the data were the same as
for the first experiment,² and therefore will not be repeated here.
The sample was made up entirely of undergraduate men this time,
whereas about one-third of them were graduate students before; the
previous 65 pairs of subjects was reduced to 47 pairs for this
experiment. The results are summarized in Table 1. The figures
included in parentheses show the results from the first experiment
while the numbers immediately above them are the corresponding
values in the present research.

² Comrey, A. L., op. cit.


Table 1

Summary of Results

                                             Corrected r with
                                           ---------------------      Beta
Score      M        SD       Reliability   High     Low     Group     Weight

High       156      18.0     ...           1.00     .50     .55       .31
           (192)    (16.5)   ...                    (.52)   (.56)     (.35)

Low        137      17.2     ...           .50      1.00    .64       .49
           (173)    (16.8)   ...           (.52)            (.59)     (.41)

Group      186      19.2     ...           .55      .64
           (178)    (19.2)   ...           (.56)    (.59)

R = .69 (.66)                R² = .48 (.44)


In the first column of Table 1 are listed the total score
categories, “high,” “low,” and “group,” standing, respectively, for
those total performances already described. The means and standard
deviations of the three sets of scores are given in the second and
third columns, respectively. These are based on the totals of the
last four of six trials. This procedure was used to obtain greater
stability. In the fourth column are given the split-half
reliability estimates for the three types of scores. The next three
columns of Table 1 give the intercorrelations of the total score
variables, corrected for attenuation in both variables. The last
column contains the beta weights for predicting “group” performance
from “high” and “low” individual performances. The multiple
correlation, R, and R², are given in the bottom row of the table.
Both the beta weights and R were computed using the corrected
correlations. The uncorrected correlations for the present
experiment were: low-high, .46, low-group, .53, and high-group,
.47. For the previous experiment, the corresponding uncorrected
correlations were .48, .53, and .50.
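The beta weights and multiple correlation can be checked from the three intercorrelations with the standard two-predictor regression formulas. The sketch below assumes those textbook formulas (the article does not print them) and uses a hypothetical function name; the inputs are the corrected and uncorrected correlations reported in the text and table.

```python
def two_predictor_regression(r_high_low: float, r_high_group: float, r_low_group: float):
    """Return (beta_high, beta_low, R squared) for predicting the group score
    from the "high" and "low" individual scores, in standard-score form."""
    denom = 1.0 - r_high_low ** 2
    beta_high = (r_high_group - r_low_group * r_high_low) / denom
    beta_low = (r_low_group - r_high_group * r_high_low) / denom
    r_squared = beta_high * r_high_group + beta_low * r_low_group
    return beta_high, beta_low, r_squared


# Corrected correlations: present experiment, then the previous one.
present = two_predictor_regression(0.50, 0.55, 0.64)
previous = two_predictor_regression(0.52, 0.56, 0.59)
# Uncorrected correlations for the present experiment (low-high .46,
# high-group .47, low-group .53).
uncorrected = two_predictor_regression(0.46, 0.47, 0.53)

for label, (b_high, b_low, r2) in [("present", present),
                                   ("previous", previous),
                                   ("uncorrected", uncorrected)]:
    print(f"{label:12s} beta_high={b_high:.2f} beta_low={b_low:.2f} R2={r2:.2f}")
```

Under these formulas the corrected values reproduce the tabled beta weights and R², while the uncorrected correlations give an R² of roughly .35, which illustrates how much of the predicted variance depends on the attenuation correction.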

An inspection of Table 1 reveals certain 
discrepancies which require some comment. 
In the present experiment, the mean individual
scores were considerably lower than
for the previous experiment, which was ex- 
pected because the task was more difficult. 
This resulted in a slight increase in variance, 
too, which again could have been expected. 
The present group performance mean was 
only slightly higher, probably due to indi- 
vidual-task practice in changing set, not 
available to performers in the first experi- 
ment. The variances were identical for the 
group task in both experiments, an outcome 
consistent with expectations in that the task 
was exactly the same in both cases. 

The reliabilities compare very favorably in 
the two experiments, except for the group 
performance score. In this case, the present 
figure was lower than the previous one. The 
corrected correlations are close in the two ex- 
periments, although the discrepancies are in 
opposite directions for the low-group and high- 
group correlations, resulting in a more im- 
pressive difference between the beta weights. 







Whereas the beta weights were fairly close in 
the first experiment, the low scores emerged 
in this research with a definite edge for pre- 
dicting group performance, although the dif- 
ference was still short of statistical signifi- 
cance. 

Looking at the comparative multiple cor- 
relation coefficients and their squares, it is 
evident that no startling improvement has 
occurred in the amount of group-perform- 
ance-score variance which can be predicted 
from a knowledge of individual performance 
scores. The proportion of predicted vari- 
ance is still less than half. This figure was 
achieved only through using correlations cor- 
rected for attenuation. The proportion of 
variance practically predictable would be less. 
It is perhaps worth mentioning that the 
multiple R values would have been even 
closer if the “group” score reliability in the 
second experiment had been higher. Since 
the figure obtained may be spuriously low, it 
would be well to consider the gain actually 
achieved with some caution. 

The results do not bear out the hypothesis 
entertained that prediction of group perform- 
ance scores can be increased markedly by 


making the individual task apparently more 
like the group task in the actual operations. 




Two other hypotheses, as yet untested, were 
offered in the previous article to account for 
the additional unpredicted variance. The 
group task may involve some special traits 
introduced by the necessity of cooperating 
with another person and there may be inter- 
action effects among individuals over and 
above stable trait influences. Attempts will 
be made in further work to explore the na- 
ture of this as yet unpredicted variance. 


Summary 


A previously reported experiment was re- 
peated with an altered design to test the for- 
mer results and a hypothesis offered to ac- 
count for the fact that group performance 
scores on a manual dexterity task could only 
be predicted rather imperfectly from knowl- 
edge of individual scores on a similar task. 
The hypothesis was offered that the predic- 
tion might be substantially improved by a 
change in design to make the group and indi- 
vidual tasks more comparable in the char- 
acter of the operations involved. The amount 
of improvement in prediction obtained was 
so slight as to require the rejection of the 
hypothesis. 


Received May 8, 1953. 





The Journal of Applied Psychology
Vol. 38, No. 2, 1954


Effects of Fatigue and Anxiety on Certain Psychomotor and
Visual Functions ¹


Sherman Ross, T. A. Hussman, and T. G. Andrews 


University of Maryland 


This experiment was an attempt to investi- 
gate the degree of behavior decrement pro- 
duced by the experience of fatigue and threat 
of bodily damage occasioned in the competi- 
tive athletic sport of boxing. The dependent 
variables chosen as possible indicators of be- 
havior decrement were: (a) steadiness score; 
(b) body sway score; (c) body sway time 
score; (d) tapping rate; and (e) critical 
flicker frequency. The primary purpose of 
the experiment was to determine whether or 
not performance on each of the five depend- 
ent variables changes significantly as a result 
of intensive muscular exercise (fatigue) or 
the fear of bodily injury (anxiety) or the in- 
teraction of these conditions in the collegiate 
competitive boxing situation. 

There has been some speculation in the past regarding the
damaging effects on behavior of sustained head blows such as are
received in continuous training in boxing (13). In addition to
these interests in boxing, such a situation appears to offer a
realistic condition of systemic fatigue, high motivation, and
anxiety such as could not be attained under the usual conditions
of laboratory investigations. These characteristics are not unlike
those which obtain in certain field conditions of military
operations and combat. In the general search for indicators of
behavior decrement for military purposes, boxing behavior was used
to approximate these characteristics of military importance.

The basis for the selection of the indicators 
used in this investigation is described below 
for each of the five dependent variables to- 
gether with a description of the manner of 
testing. 


¹ This experiment is one of a series of studies on
behavior decrement performed under Contract No. 
DA-49-007-MD-222 between the Medical Research 
and Development Board, Office of The Surgeon Gen- 
eral, Department of the Army and the University of 
Maryland. The opinions and assertions expressed in 
this report do not necessarily reflect the views of the 
Department of the Army. 


Tests and Indicators Used 

Steadiness has been demonstrated to show 
changes under certain conditions of stress, and it 
has been reported to change with fatigue or work 
output (1, 4, 5, 18). Hand steadiness and 
tremor have also been related to emotional stimu- 
lation (6, 7) and to certain conditions of motiva- 
tion (4). Because of these features, a test of 
hand steadiness was included among the depend- 
ent variables. For this test a target hole in a 
vertically adjustable metal plate was used. The 
subject’s task was to keep a 0.02 inch diameter 
stylus inserted into the 0.136 inch hole for 20 
seconds with the arm fully extended and unsup- 
ported. The number of contacts with the edge 
of the hole during this period served as the score. 

Body sway measurements have offered rather 
controversial results in the past when related to 
fatigue (11, 18) and to loss of sleep (5, 15).
Because of the possible effects of head blows 
sustained in boxing, measures of body sway were 
obtained. For this purpose an arrangement simi- 
lar to that for steadiness was used. However, 
in this case the stylus was longer and the hole 
diameter was 0.358 inch. The subject was re- 
quired to hold the stylus in the hole, but in this 
case without the aid of visual cues. When contact
was made with the edge of the hole, a buzzer
was automatically sounded as a signal to the sub- 
ject. Two scores were derived from this test: 
a body sway score of the number of contacts 
made in the 20 second period, and a body sway 
time score consisting of the total amount of 
time in seconds the stylus was in contact with 
the edge of the hole during the observation pe- 
riod. These were treated as separate scores in 
the analysis of the data. 

Tapping tests serve as measures of rather sim- 
ple performance, but have been considered by 
some investigators as useful indices of fatigue 
(15, 16). Tapping has been shown to be re- 
lated to the decrement produced by high altitude 
(9). The tapping test apparatus used here con- 
sisted of the Dunlap modification of the Whipple 
Tapping Board (3) and a 0.20 inch diameter 
stylus. The tapping targets were two 3 inch 
square brass plates separated by 1 inch of bake- 
lite. The subject was to tap alternately on the 
plates as rapidly as possible for a period of 15 
seconds. The score used was the total number 
of taps on the plates in this allotted time. This 
brief time period was used as an attempt to di- 
minish the factor of learning, which has been 
shown to affect tapping scores (17). 




Critical Flicker Frequency has been used in 
several investigations on fatigue with contro- 
versial results (12, 14, 18). There has been 
some indication that CFF changes when the in- 
dividual is subjected to intensive strain (2). 
The apparatus used in the present study was the 
Krasno-Ivy Flicker Photometer (8), which is es- 
sentially an episcotister arrangement delivering 
square wave flashes of light on a 34 inch ground 
glass screen. The subject was seated 5 feet from 
the stimulus screen. A modified method of lim- 
its was used, in which the experimenter manipu- 
lated the stimulus from “fusion to flicker” and 
the subject responded at his threshold. Six 
“descending” trials were employed, the first two 
serving as practice. The score or threshold 
measure was the mean number of flashes for the 
last four trials.

In each of the above tests only a brief period 
could be devoted to obtaining a score, since in 
many instances the subjects were being measured 
immediately after strenuous exercise and before 
they were covered, rubbed down, or bathed. 
Longer testing periods would have increased the 
reliabilities of the measures taken, but also might 
possibly have allowed the injurious effects of 
chilling the subjects. 


Subjects 


Twenty-four male college students ranging in 
age from 19 to 25 years were used as subjects. 
Twelve of the group were experienced collegiate 
boxers and members of the University of Mary- 
land Boxing Team for 1952. The remaining sub- 
jects were members of a Physical Education class 
in boxing and should be classed as novice boxers. 
All subjects were in excellent physical condition. 


Independent Variables and Experimental 
Design 


Each of the 24 subjects was measured three 
times on each of the tests under each of four 
conditions of the investigation. These four con- 
ditions were as follows: 


a. At rest, no previous strenuous exercise, no 
expectation of going into the ring to fight. 

b. Before fighting a three-round supervised 
bout, no previous exercise. 

c. After three rounds of very strenuous work- 
out on a heavy punching bag, not in the ring nor 
expecting to go into the ring. 

d. After fighting a three-round supervised bout 
with an opponent. 

These four conditions yield a basic 2 × 2 block
of the experimental design, which is diagrammed 
in Table 1. It may be seen that this arrange- 
ment opposes the no-exercise conditions (F-O) 
to the heavy exercise conditions (F) for a test 
of the change in each variable as a result of 
fatigue. The test of change in each variable due 
to the anxiety occurring in the boxing situation 




is made by opposing the no-anxiety conditions 
(A-O) to the high anxiety conditions (A). The 
problem of fatigue in this arrangement is quite 
straightforward. The problem of anxiety, how- 
ever, offers some question. In this regard it may 
be said that all observations on and reports from 
the men immediately before and after such com- 
petitive boxing indicate severe tension and con- 
cern over the threat of pain and bodily damage 
or loss of the bout. 

In order to minimize the effects of the order 
of taking the tests in the battery, each subject 
was randomly assigned to one of the 24 possible 
orders of test administration, which he maintained
throughout the experiment. Each subject
was measured 12 times on each test, three times
under each of the four experimental conditions. 
The restriction placed upon the order of the con- 
ditions was that the first time a subject took the 
tests he was under the rest condition so that 
giving the instructions did not interfere with the 
condition nor the reverse. 
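The counterbalancing of test order can be sketched as follows. The figure of 24 possible orders matches the permutations of four tests (4! = 24), which assumes the battery is counted as four tests, with the body sway test yielding two scores. The function name, the seed, and the choice to give each order to exactly one subject are assumptions made for illustration; the article states only that assignment was random.

```python
import itertools
import random

# Four tests in the battery (assumption: body sway count and body sway time
# come from a single test administration).
TESTS = ("steadiness", "body sway", "tapping", "CFF")


def assign_test_orders(n_subjects: int = 24, seed: int = 0) -> dict:
    """Randomly pair each subject with one distinct order of the four tests."""
    orders = list(itertools.permutations(TESTS))  # 4! = 24 possible orders
    if n_subjects > len(orders):
        raise ValueError("more subjects than distinct orders")
    rng = random.Random(seed)
    rng.shuffle(orders)
    return {subject: orders[subject - 1] for subject in range(1, n_subjects + 1)}


assignment = assign_test_orders()
print(len(set(assignment.values())), "distinct orders assigned")
print("subject 1 takes the tests in this order:", assignment[1])
```

Each subject then keeps the same order for all 12 testing sessions, so order effects are spread across subjects rather than confounded with the conditions.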


Results and Discussion 


The results are presented and analyzed 
separately for each of the dependent vari- 
ables studied. In each case reference is made 
to the paradigm presented in Table 1, and 
the code letters used refer to the designated 
experimental conditions and their combina- 
tions as a system for presenting the obtained 
means. 

Table 1

Experimental Design Indicating the Conditions of
Measurement and Their Relationships

                                    FATIGUE
                     Absent (F-O)            Present (F)

ANXIETY
  Absent (A-O)       a. At rest              c. After punching-bag workout
  Present (A)        b. Before bout          d. After bout


The experiment was conceived and designed to allow analysis of the
results in two separate manners. The fact that each block of
measures taken on the twenty-four Ss is replicated twice allows a
within-individual estimate of variance to be used as an error term
to evaluate the effects of the treatment conditions on the
variables in the population used. This error term contains
variance of two types, that associated with instrument error and
that associated with individual diurnal variation. This analysis is
intended to test the theoretical and perhaps somewhat obvious
question of whether these variables are affected by the treatment
conditions of fatigue and anxiety in the sample used.

The second analysis, which uses an esti- 
mate of the individual differences variance as 
the error term, is intended to answer the 
question of whether these test variables are 
useful as reliable indices of the independent
variables for practical application. Frequently
the question of whether a variable
changes significantly as a result of such con- 
ditions as fatigue and anxiety has been con- 
fused with the question of whether it may be 
used as an adequate indicator of these con- 
ditions. The two analyses employed test 
each of these questions in turn with what is 
felt to be the proper error term for each. 
The second analysis also provided a test of 
the replications as a main effect, thus enabling
a check on possible changes due to learning,
the presence of which of course would
cast some question on their usefulness as 
indicators. In all cases tests of homogeneity 
of variance were satisfied. Table 2 presents 
the means for each experimental condition 
for each of the dependent variables used, ac- 
cording to the paradigm in Table 1. Tables 
3 and 4 present composite results of the tests 
of significance. Reference is made to these 
three tables in the description of results for 
each type of experimental measure. 
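The difference between the two analyses comes down to which mean square goes in the denominator of the F ratio. The sketch below uses mean squares as read from the scanned Tables 3 and 4 for two effects; the helper name is hypothetical and the readings should be treated as approximate where the scan is unclear.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio: the effect mean square divided by the chosen error mean square."""
    return ms_effect / ms_error


# Mean squares as read from Tables 3 and 4.
effects = {
    "fatigue on steadiness": {
        "ms_effect": 30114.67,
        "within_individuals": 143.14,  # Table 3 error term
        "within_cells": 417.28,        # Table 4 error term
    },
    "anxiety on body sway": {
        "ms_effect": 333.68,
        "within_individuals": 34.66,
        "within_cells": 69.20,
    },
}

for name, ms in effects.items():
    f_within_ind = f_ratio(ms["ms_effect"], ms["within_individuals"])
    f_within_cells = f_ratio(ms["ms_effect"], ms["within_cells"])
    print(f"{name}: F = {f_within_ind:.1f} (within individuals), "
          f"F = {f_within_cells:.1f} (within cells)")
```

The same effect mean square yields a smaller F against the within-cells term, because that term also carries individual differences; this is why an effect can be reliable in the first analysis yet a weaker practical indicator in the second.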

Steadiness. The total mean score for all subjects under all
conditions was 72.23; for condition F-O = 62.0, F = 82.46,
A-O = 72.25, A = 72.22. The differences associated with fatigue
conditions are significant at the .001 level, as are individual
differences. Anxiety conditions effected no change in the measures.

There is a questionable interaction between fatigue and anxiety,
and the interaction between fatigue and individual differences is
very significant, as is the interaction of anxiety and individual
differences. From these combinations of interactions it appears
that anxiety may act here to increase the scores of some
individuals and decrease or not affect the scores of others, thus
destroying the main effect. Anxiety then may be acting as a
sensitizer to fatigue effects in some instances and a desensitizer
in other instances. No significant change was observed in
successive measurements on the same individual under the same
condition. The general conclusion here is that fatigue produces a
general decrease in steadiness.


Table 2

Means of Experimental Results for Specified Tests and Conditions

                                        Fatigue
Test              Anxiety      Absent      Present     Mean

Steadiness        Absent       ...         84.05       72.25
                  Present      ...         80.86       72.22
                  Mean         62.0        82.46       72.23

Body Sway         Absent       ...         ...         29.18
                  Present      ...         ...         31.35
                  Mean         27.24       33.3        30.26

Body Sway Time    Absent       590.36      638.24      614.30
                  Present      490.89      667.70      579.30
                  Mean         540.62      652.97      596.80

Tapping           Absent       ...         ...         80.75
                  Present      ...         ...         82.18
                  Mean         79.64       83.29       81.47

CFF               Absent       48.450      49.471      49.005
                  Present      ...         47.811      48.190
                  Mean         48.555      48.641      48.598


Table 3

Analyses of Variance for the Specified Experimental Variables,
Using “Within Individuals” as Measure of Experimental Error¹

                         MS for         MS for        MS for Body      MS for       MS for
Source                   Steadiness     Body Sway     Sway-Time        Tapping      CFF

Fatigue Conditions       30,114.67***   2,628.12***   908,664.34***    957.03***    .51
Anxiety Conditions       .09            333.68**      88,235.00        148.78*      47.61***
Individuals              3,048.33***    341.42***     435,366.00**     730.70***    149.85***
Interactions:
  Fat. × Anx.            718.83*        3.56          299,215.59*      552.18***    ...
  Fat. × Ind.            322.07***      64.24*        84,111.68        29.16        5.48***
  Anx. × Ind.            301.88**       68.32**       54,416.36        94.04***     4.68**
  Fat. × Anx. × Ind.     191.63         77.48**       141,400.42***    59.87        3.30
Error:
  Within Individuals
  (replications)         143.14         34.66         56,909.71        38.17        2.41

Total df = 287

¹ The asterisks identify the conventional levels of significance: * for .05, ** for .01, and *** for .001.


Body Sway. The total mean score for all conditions was 30.26; for
condition F-O = 27.24, F = 33.3, A-O = 29.18, A = 31.35. Fatigue
effects very significantly increase body sway, and anxiety appears
also significantly to produce the same results. Individual
differences are also significant here. The interactions were
insignificant when compared with the highest order interaction, as
recommended by McNemar (10). No significant effects were obtained
for repeated measures under the same conditions. The general
conclusion here is that fatigue and anxiety both increase body sway
and significantly more for some individuals than for others. The
results obtained serve to corroborate other studies on steadiness
and body sway (1, 4, 5, 6, 7, 11, 18).


Table 4

Analyses of Variance for the Specified Experimental Variables,
Using “Within Cells” as Measure of Experimental Error¹

                           MS for         MS for        MS for Body         MS for      MS for
Source                     Steadiness     Body Sway     Sway-Time           Tapping     CFF

Fatigue Conditions         30,114.67***   2,628.12***   908,664.34***       957.03**    .51
Anxiety Conditions         .09            333.68*       88,235.00***        148.78      47.61
Replications               338.04         21.26         128,191.06***       354.59*     .76
Interactions:
  Fat. × Anx.              718.83         3.56          299,215.59***       552.78*     51.77
  Fat. × Repl.             3.96           32.04         45,262.12**         3.94        7.84
  Anx. × Repl.             216.58         60.59         20,433.86           8.53        2.41
  Fat. × Anx. × Repl.      34.17          5.59          12,374,698.76***    61.86       5.52
Error:
  Within Cells
  (individual differences) 417.28         69.20         8,120.64            ...         ...

Within Cells df = 276; Total df = 287

¹ The asterisks identify the conventional levels of significance: * for .05, ** for .01, and *** for .001.

Body Sway Time Scores. The total mean
score for all conditions was 596.80; for condition
F-O = 540.62, F = 652.97, A-O = 614.30,
A = 579.30. Fatigue effects are very
reliable in their action to increase these scores. 
However, replication measures under the 
same conditions as well as differences among 
individuals were also highly significant. Be- 
cause of these features and a very significant 
triple interaction effect, there was judged to 
be such a large amount of uncontrolled vari- 
ability that no definite conclusions are offered 
for this measure of behavior decrement. 
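The condition means quoted for each measure are the row and column averages of the four cell means in the 2 × 2 design. A minimal sketch for the body sway time scores, using the cell means as read from Table 2: one cell is partly garbled in the scan, and 490.89 is the reading consistent with the reported margins. The function name and dictionary layout are illustrative only.

```python
# Cell means for body sway time, keyed by (anxiety level, fatigue level).
CELL_MEANS = {
    ("A-O", "F-O"): 590.36,  # at rest
    ("A-O", "F"): 638.24,    # after punching-bag workout
    ("A", "F-O"): 490.89,    # before bout (reading inferred from the margins)
    ("A", "F"): 667.70,      # after bout
}


def marginal_means(cells: dict) -> dict:
    """Average the cell means over the other factor to get condition means."""
    means = {}
    for level in ("A-O", "A"):
        row = [value for (anxiety, _), value in cells.items() if anxiety == level]
        means[level] = sum(row) / len(row)
    for level in ("F-O", "F"):
        column = [value for (_, fatigue), value in cells.items() if fatigue == level]
        means[level] = sum(column) / len(column)
    means["total"] = sum(cells.values()) / len(cells)
    return means


for condition, value in marginal_means(CELL_MEANS).items():
    print(f"{condition:6s} {value:.2f}")
```

The cell means show the pattern behind the interaction: with fatigue absent the anxiety condition has the lower mean, and with fatigue present it has the higher one.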

Tapping. The total mean score for all 
conditions was 81.47; for condition F-O =
79.64, F = 83.29, A-O= 80.75, A = 82.18. 
As in the case of the body sway time scores, 
there is a large amount of uncontrolled vari- 
ability evidenced for the tapping measures. 
The fatigue condition acted to increase tap- 
ping reliably. However, replications within 
the same conditions as well as individual dif- 
ferences proved significant. There is prob- 
ably too great a learning factor allowed in 
the conditions of measurement of tapping. 
The results in general indicate that the tap- 
ping test would be a sensitive indicator of 
behavior decrement if the learning factor were 
better controlled. 

Critical Flicker Frequency. The total mean
score for all conditions was 48.598; for
condition F-O = 48.555, F = 48.641, A-O =
49.005, A = 48.190. Fatigue alone did not
produce any significant change, but anxiety 
effects were highly significant in their de- 
crease of CFF. Individual differences were 
significant, and replication effects were not 
significant. 

Examination of the significance of the in- 
teractions in the case of CFF suggests that 
the same relationship holds between fatigue 
and CFF that obtained between anxiety and 
steadiness. The interaction between anxiety 
and individuals is significant, indicating a 
differential effect. The interaction of fatigue 




and individuals is also present and suggests 
that some individuals change in one direction 
here while others change minimally or in the 
other direction, thus reducing the main effect 
that is predictable from fatigue. A more im- 
portant interaction is found between the main 
effects of fatigue and anxiety. From the spe- 
cific results obtained with CFF, it appears 
that this test may be a useful one for studies 
on behavior decrement only in situations of 
individual cases. 

Interrelationships Among the Measures. Intercorrelations
were obtained among the average
scores of the tests as they appeared
under the rest or control condition. These
Pearson correlations were obtained only on 
the 24 Ss, and it was found that only two 
such correlations were significant. These were 
the correlations between steadiness and body 
sway (r = .550) and between steadiness and 
the body sway time scores (r = .407). 
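With only 24 Ss, a Pearson r must be fairly large to reach significance. The sketch below converts the two reported correlations to t with n - 2 degrees of freedom and compares them with the two-tailed .05 critical value for 22 df (2.074). The article does not state which test was applied, so the two-tailed t test, the function name, and the constant name are assumptions for illustration.

```python
import math

N_SUBJECTS = 24
T_CRITICAL_05_TWO_TAILED_DF22 = 2.074  # tabled t for 22 df, two-tailed .05


def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


reported = {
    "steadiness vs. body sway": 0.550,
    "steadiness vs. body sway time": 0.407,
}

for pair, r in reported.items():
    t = t_for_correlation(r, N_SUBJECTS)
    verdict = "significant" if t > T_CRITICAL_05_TWO_TAILED_DF22 else "not significant"
    print(f"{pair}: r = {r:.3f}, t = {t:.2f}, {verdict} at .05")
```

Under these assumptions the smaller correlation clears the criterion only narrowly, which fits the report that just two of the intercorrelations were significant.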

Body sway appears to have a factor in com- 
mon with steadiness, and this is possibly the 
reason that body sway measures were found 
to be adequate indices of the stress involved 
in this investigation. It is also possible that 
the body sway test involves a factor or fac- 
tors not present in the steadiness test, because 
the former was found to be a significant indi- 
cator of anxiety effects, while this did not 
hold for the steadiness test. 

There was one source of variation that was impossible to control,
namely the actual amount of bodily damage or physical punishment
sustained by each of the Ss during the conditions of competitive
boxing. It was felt that some system should be instituted that
would allow a possible check on the validity of some of the
experimental assumptions, and so correlations were computed between
the number of head blows received and scores on each of the tests.
The estimates of head blows were furnished by Mr. Frank Cronin, the
University Boxing Coach, who observed every bout and tallied blows
on a prearranged data form. None of the correlations was found to
be statistically significant on a one-tail t test, which is the
appropriate test considering the hypothesis in this case. It would
appear that within the limits of the measuring techniques and the
design of the







study, the number of head blows sustained 
had little or no effect on the test scores of 
the Ss. 

As part of another investigation, to be re- 
ported elsewhere, protracted boxing experi- 
ence with its attendant number of head blows 
produced no reliably indicated changes in 
the electroencephalographic records of ama- 
teur boxers, some of whom were from among 
the Ss used in the present investigation. 

As far as the present results are concerned, 
it appears that measures of steadiness more 
than the other variables tested satisfy more 
of the criteria of reliability and predictability 
to be used as indicators of behavior decre- 
ment. Hand steadiness serves for indications 
of fatigue, and body sway which is a form 
of steadiness measure serves for indication of 
either fatigue or the type of anxiety produced 
in this study. These suggestive results may
be taken as recommendations for further
investigations under a greater variety of stress
conditions. 

The other variables employed in this study 
may be made into more useful measures for 
studies of stress if their trial-to-trial varia- 
tion and very wide individual differences may 


be diminished by deriving scores through 
other techniques, reducing practice effects, 
and otherwise accounting for the larger rela- 
tive amounts of variability now classifiable 
as experimental error. 


Summary and Conclusions 


As part of a larger research program on 
indicators of behavior decrement, this experi- 
ment investigated the comparative value of 
several selected measures of behavior decre- 
ment under conditions of fatigue and anxiety. 
The dependent variables chosen as possible 
indicators of behavior decrement were: (a) 
steadiness; (b) body sway; (c) body sway 
time score; (d) tapping rate; and (e) criti- 
cal flicker frequency. The primary purpose 
of the experiment was to determine whether 
or not performance on each of the five de- 
pendent variables changed significantly as a 
result of intensive muscular exercise (fatigue) 
or the fear of bodily injury (anxiety) or the 
interaction of these conditions in the collegi- 
ate competitive boxing situation. 




Twenty-four boxers were measured under 
the following four conditions: at rest; after 
heavy exercise; before fighting; and after 
fighting. The tests were administered three 
times to each subject under each of the ex- 
perimental conditions. The analysis of vari- 
ance technique was used to test the changes 
in each variable as a function of the inde- 
pendent variables. Two separate analyses of 
the results were made: (1) using “within in- 
dividuals”; and (2) using “within cells” as 
the measure of experimental error. The re- 
sults permit the following major conclusions: 


1. Hand steadiness decreased significantly with fatigue
(the mean contact score rose from 62.0 to 82.46), but not
with the anxiety conditions. No significant change was
observed in successive testing on the same 
individual under the same conditions. 

2. Fatigue and anxiety significantly increased
body sway scores.

3. Body sway time scores were found to 
be unreliable, although the fatigue conditions 
significantly increased these scores. 

4. Tapping was found to be unreliable, 
possibly due to a learning factor. Significant 
changes were found, however, in the test 
scores as a result of both fatigue and anxiety. 

5. Critical flicker frequency thresholds 
were shown to decrease significantly as a re- 
sult of anxiety. The reliability of the test 
was high, and it is felt that it may be useful 
in studies of behavior decrement in situations 
of individual cases. 

6. No relationship was found between the 
dependent variables used and the number of 
head blows received by the subjects during 
a boxing bout. 

7. Measures of steadiness more than the 
other variables tested satisfy the criteria for 
indicators of behavior decrement. Hand 
steadiness serves as an indicator of fatigue, 
and body sway (which is a form of steadi- 
ness measure) may serve as an indicator of 
either fatigue or the type of anxiety produced 
in the experiment. The remaining variables 
tested in this experiment may be made into 
more useful measures for studies of the effects 
of stress if trial-to-trial variation and the very 
wide individual differences exhibited are di- 
minished. 


Received June 8, 1953. 







References 


1. Bousfield, W. W. The influence of fatigue upon tremor. J. exp. Psychol., 1932, 15, 104-107.

2. Brozek, J. and Keys, A. Flicker fusion frequency as a test of fatigue. J. indust. Hyg., 1944, 26, 169-174.

3. Dunlap, K. Improved form of steadiness tests and tapping plate. J. exp. Psychol., 1921, 4, 430-433.

4. Eaton, M. T. The effect of praise, reproof and exercise upon muscular steadiness. J. exp. Educ., 1933, 2, 44-59.

5. Edwards, A. S. Effects of the loss of one hundred hours of sleep. Amer. J. Psychol., 1941, 54, 80-91.

6. Edwards, A. S. Finger tremor and battle sounds. J. abnorm. soc. Psychol., 1948, 43, 396-399.

7. Kellogg, W. N. The effect of emotional excitement upon muscular steadiness. J. exp. Psychol., 1932, 15, 142-165.

8. Krasno, L. R. and Ivy, A. C. The response of the flicker fusion threshold to nitroglycerin and its potential value in the diagnosis, prognosis, and therapy of subclinical and clinical cardio-vascular disease. Circulation, 1950, 1, 6, 1267-1276.

9. Malmo, R. B. and Finan, J. L. A comparative study of eight tests in the decompression chamber. Amer. J. Psychol., 1944, 57, 389.

10. McNemar, Q. Psychological statistics. New York: John Wiley and Sons, 1949. P. 288.

11. Ryan, A. H. and Warner, M. The effects of automobile driving on the reactions of the driver. Amer. J. Psychol., 1936, 48, 403-421.

12. Simonsen, E. and Enzer, E. Measurement of fusion frequency of flicker as a test of fatigue of the central nervous system: observations on laboratory technicians and office workers. J. indust. Hyg., 1941, 23, 83-89.

13. Steinhaus, A. H. Boxers brains swapped for medals. J. of the Amer. Assn. for Health, Physical Ed. and Recreation, 1951, 8, 12-14.

14. Tyler, D. B. The fatigue of prolonged wakefulness. Fed. Proc., 1947, 6, 218.

15. Warren, N. and Clark, B. Blocking in mental and motor tasks during a 65-hour vigil. J. exp. Psychol., 1937, 21, 97-105.

16. Wells, F. L. A neglected measure of fatigue. Amer. J. Psychol., 1908, 19, 345-358.

17. Wells, F. L. Normal performances on the tapping test before and during practice, with special reference to fatigue phenomena. Amer. J. Psychol., 1908, 19, 437-483.

18. Wulfeck, W. H. Fatigue and hours of service of interstate truck drivers. II. Psychomotor reactions. Publ. Hlth. Bull., Washington, 1941, No. 265, 135-177.





The Journal of Applied Psychology
Vol. 38, No. 2, 1954 


Dimensional Analysis of Motion: VII. Extent and Direction of 
Manipulative Movements as Factors in Defining Motions 1


Shelby J. Harris and Karl U. Smith 


University of Wisconsin 


Earlier investigations examined the extent and direction of travel movements as factors in determining the duration of the manipulative and travel components of motion (5, 6). Contrary
to assumptions and observations in the fields 
of human engineering and time and motion 
study, these studies indicate that greater 
travel distances increase the duration of both 
the travel and manipulation components of 
a motion. The same experiments also show 
that the direction of travel movement affects 
only the travel time of the motion. The pres- 
ent experiment extends this line of investiga- 
tion by studying the effects of varying the 
extent and direction of manipulation on the 
component movements of travel and manipu- 
lation in the motion pattern. 


Methods and Procedure 

The apparatus (Figure 1) used in this 
study consists of an electronic motion ana- 
lyzer which has been named the analytic re- 
actometer (3). This device is designed in 
terms of two main features: (a) control of 
the space dimensions of the motion pattern; 
and (b) separate measurement of the ma- 
nipulative and travel components of motion 
through the use of special electronic relays 
(4). The electronic methods of motion 
analysis, as adapted to the present apparatus, 
are based on the principle of making the hu- 
man operator a key in a circuit consisting of 
the performance situation, the operator, an 
electronic relay and precision time clocks. 
When the subject operates one of the switches 
of the apparatus, he activates a vacuum tube 
relay causing the manipulation-time clock to 
run as long as he is in contact with the switch. 
When he ceases contact with the switch, another relay is thrown, causing the travel-time clock to run. This clock is stopped and the manipulation-time clock started again as soon as another switch is touched.

1 This research has been supported by funds voted by the Legislature of the State of Wisconsin, and assigned by the Graduate School Research Committee, The University of Wisconsin.
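
The division of a motion into manipulation and travel time, as described for the two clocks, can be illustrated with a short Python sketch. It is not a model of the original vacuum tube circuitry: the event format (a time-ordered list of touch and release times in seconds) and the name `split_motion_times` are assumptions made only for this example.

```python
def split_motion_times(events):
    """Accumulate manipulation and travel time from switch-contact events.

    `events` is a hypothetical, time-ordered list of (time_sec, kind) pairs,
    where kind is "touch" (the hand contacts a switch) or "release" (the hand
    leaves it). Manipulation time runs while the hand is in contact with a
    switch; travel time runs between a release and the next touch.
    """
    manipulation = 0.0
    travel = 0.0
    last_time = None
    last_kind = None
    for time_sec, kind in events:
        if last_kind == "touch" and kind == "release":
            manipulation += time_sec - last_time
        elif last_kind == "release" and kind == "touch":
            travel += time_sec - last_time
        elif last_kind is not None:
            raise ValueError("touch and release events must alternate")
        last_time, last_kind = time_sec, kind
    return manipulation, travel


# Hypothetical record of a subject turning three switches in succession.
example_events = [
    (0.0, "touch"), (0.5, "release"),
    (0.75, "touch"), (1.25, "release"),
    (1.5, "touch"), (2.0, "release"),
]
manipulation_sec, travel_sec = split_motion_times(example_events)
print(manipulation_sec, travel_sec)
```

In this sketch each interval is credited to exactly one of the two totals, which mirrors the way the apparatus keeps only one clock running at a time.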

The planned performance situation em- 
ployed in the experiment consists of a control 
panel, 45.7 cm. square, on which are mounted 
25 rotary switches. These switches are 
mounted in five rows of five switches and are 
spaced by a distance of 7.6 cm. Each switch 
has 17 possible settings spaced at intervals of 
20 degrees. Settings of 40, 80, and 120 de- 
grees clockwise and 40, 80, and 120 degrees 
counterclockwise are marked on the dials. 

In the design of the experiment three ex- 
tents of manipulative movement, 40, 80, and 
120 degrees, and two directions of such move- 
ment, clockwise and counterclockwise, were 
used, thus providing a total of six separate 
conditions. The over-all pattern of travel 
movement was the same on all tests with the 
subject starting at the top of the control panel 
and working from left to right through all 
five rows of switches. Each subject per- 
formed each of the six tests once at approxi- 
mately the same time on seven successive 
days. All performances were carried out with 
the right hand. 

A total of 42 right-handed men and women 
students from the elementary classes in psy- 
chology at the University of Wisconsin served 
as subjects. 

In order to control the effects of the se- 
quence and ordinal position of the six experi- 
mental conditions a 6 X 6 latin square with 
seven replications of the same square was 
used. Subjects were assigned to a given se- 
quence of tests in order of appearance for the 
experiment and were required to repeat the 
same sequence on seven successive days. 
Separate analyses of variance were performed 
on the travel and manipulation time data for 
the first and seventh days of the experiment 
only. Performances on these days are considered to represent unskilled and skilled levels of performance. The choice of seven days of practice was arbitrary and, therefore, the term “skilled” is not intended to imply a maximum level of performance.

Fig. 1. A schematic diagram of the analytic reactometer showing the arrangement of controls on the panel and the timing mechanism (labeled parts: control panel, control switches, electric clocks). The inset illustrates the design of the individual manual control. The 120 degree extents, clockwise and counterclockwise, which were used in the experiment are not shown on the dial.
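
The counterbalancing scheme described above (a 6 X 6 latin square replicated seven times, with subjects assigned to sequences in order of appearance) can be sketched as follows. The particular square used in the experiment is not reported, so the cyclic square below is an assumption, as are the names `cyclic_latin_square` and `assign_sequences`.

```python
CONDITIONS = [
    "40 deg clockwise", "80 deg clockwise", "120 deg clockwise",
    "40 deg counterclockwise", "80 deg counterclockwise",
    "120 deg counterclockwise",
]


def cyclic_latin_square(items):
    """Return an n x n latin square built by cyclically shifting `items`.

    This is only one of many possible squares; it is assumed here for
    illustration and is not taken from the original report.
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_sequences(n_subjects, square):
    """Assign subjects, in order of appearance, to rows of the square."""
    return {subject: square[subject % len(square)]
            for subject in range(n_subjects)}


square = cyclic_latin_square(CONDITIONS)
assignments = assign_sequences(42, square)

# Each row is one test sequence; each condition occupies every ordinal
# position exactly once across the six sequences.
for row in square:
    print(row)
```

With 42 subjects and six sequences, each sequence is followed by seven subjects, which corresponds to the seven replications of the square.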

Learning curves of the component move- 
ments for the various tests over the seven 
days were also constructed. 

During the testing procedure the subject was seated on a chair and his height adjusted so that his eyes were approximately level with the top row of switches. The subject was instructed to move the chair toward or away from the control panel to a comfortable position but was required to keep it centered in front of the panel throughout the test session.
Prior to each of the individual tests on the 
first day, the subjects were given a practice 
trial consisting of turning the first 10 switches 
to the appropriate position. Each subject 
was instructed to turn the switches to the 
appropriate position as rapidly as possible 
and at the same time to be careful to posi- 
tion the switches accurately. Although error 
scores were not used in the analysis of the 
data, they were recorded in order to discour- 
age subjects from becoming careless with re- 
gard to accuracy. 


Results 


Figure 2 shows the learning curves for travel and manipulation movements for the three conditions involving clockwise direction of manipulation. Analogous curves for counterclockwise direction were obtained, but since the two sets of curves are much the same, only those for clockwise rotation are shown. It is apparent that over the seven-day period the manipulation-time scores show considerably greater improvement than the travel-time scores. The major difference in the rate of improvement between the two motion components is during the first three days of practice. Quite similar practice effects are found for the two component movements over the last four days of the experiment. These changes in performance are shown in Table 1 in terms of per cent change from day one to day three, from day three to day seven, and from day one to day seven. An analysis of variance performed on the data for days one and seven indicates that the changes in duration of both travel and manipulation movements over the seven-day period are significant at the .001 level of confidence. 2

2 The raw data and the summaries for the analysis of variance for this experiment are on file at the University of Wisconsin in the master’s thesis of Mr. Shelby Harris entitled “Dimensional Analysis of Motion: The Factors of Direction and Extent of Manipulative Movement in Motion.”

Fig. 2. Learning curves for 40, 80, and 120 degrees extent of manipulation in a clockwise direction. The mean times for 42 subjects are shown separately for the manipulation and travel components of motion. Analogous curves for counterclockwise direction are similar to those shown. (Ordinate: mean time in sec.; separate curves for the 40, 80, and 120 degree extents.)


Table 1

Per Cent Change in Manipulation and Travel Time from Day 1 to Day 3, Day 3 to Day 7, and Day 1 to Day 7 under the Different Experimental Conditions

                          Manipulation                          Travel
                   Day 1-    Day 3-    Day 1-       Day 1-        Day 3-    Day 1-
Exp. Cond.         Day 3     Day 7     Day 7        Day 3         Day 7     Day 7

40 Deg. Right      16.08      8.75     23.43         1.20          5.92     [illegible]
80 Deg. Right      21.11      8.53     27.84         [illegible]   8.00     [illegible]
120 Deg. Right     20.00     10.14     28.11         1.49         10.41     [illegible]
40 Deg. Left       16.13      7.09     22.08        -2.33          7.53     [illegible]
80 Deg. Left       20.23      5.80     24.86        -3.95         10.78     [illegible]
120 Deg. Left      15.19      6.27     20.50         2.96         11.05     [illegible]


The data on the effects of direction and extent of manipulation were first examined for homogeneity of variance between the different experimental conditions. A Bartlett chi-square test for homogeneity of variance applied to the time-score data for days one and seven proved significant, thus necessitating a logarithmic transformation of the data. All of the analyses were performed on the transformed data.

In order to evaluate the effects of the various sequences of tests, analyses of variance were performed on the travel and manipulation data from the latin squares for days one and seven. In no instance was the sequence of tests a significant variable.
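
The homogeneity check and logarithmic transformation mentioned above can be illustrated with a minimal sketch. The time scores below are hypothetical (the raw data are on file only in the thesis), the base of the logarithm is not stated in the report and base 10 is assumed, and `bartlett_statistic` is an illustrative name for the usual textbook form of Bartlett's statistic.

```python
import math
import statistics


def bartlett_statistic(groups):
    """Bartlett's chi-square statistic for homogeneity of variance.

    The result is referred to a chi-square distribution with
    len(groups) - 1 degrees of freedom.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [statistics.variance(g) for g in groups]
    total = sum(sizes)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (total - k)) / (
        3 * (k - 1)
    )
    return numerator / correction


# Hypothetical time scores (sec.) whose spread grows with their mean,
# a pattern that is common for timed performances.
spread = [0.8, 0.9, 1.0, 1.1, 1.25]
raw_groups = [[base * f for f in spread] for base in (10.0, 30.0, 90.0)]
log_groups = [[math.log10(x) for x in g] for g in raw_groups]

raw_stat = bartlett_statistic(raw_groups)
log_stat = bartlett_statistic(log_groups)
print(round(raw_stat, 2), round(log_stat, 2))
```

For these invented scores the statistic on the raw scale exceeds 5.991, the .05 chi-square value for 2 degrees of freedom, while on the logarithmic scale it falls to essentially zero. That is the situation in which a logarithmic transformation is appropriate.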

Figures 3 and 4 show the relation between 
the mean travel and manipulation times and 
the extent and direction of manipulation on 
days one and seven respectively. The means 
shown in the figures have been computed on 
the transformed scale and then converted 
back to the original scale. As may be seen 
from the graphs, the mean manipulation 
times increase considerably with increased 
extent of manipulation for both clockwise 
and counterclockwise directions of movement. 
The function relating the two is approxi- 
mately linear. With the exception of the 120- 
degree movement on day one, the mean ma- 
nipulation times for clockwise direction are 
consistently less than the comparable figures 
for counterclockwise direction. The mean 
travel times also increase with increased ex- 
tents of manipulation in both directions, but 
the increase is not as pronounced as it is for 
manipulation times. Inspection of the travel time curves suggests that the relation between mean performance and extent of manipulation also approximates linearity.
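
A mean computed on the logarithmic scale and converted back to seconds, as was done for Figures 3 and 4, is the geometric rather than the arithmetic mean of the original scores. The sketch below uses hypothetical times and assumes base-10 logarithms; `back_transformed_mean` is an illustrative name.

```python
import math


def back_transformed_mean(times):
    """Mean computed on the log scale, then converted back to seconds."""
    mean_log = sum(math.log10(t) for t in times) / len(times)
    return 10 ** mean_log


# Hypothetical manipulation times (sec.) for one experimental condition.
times = [10.0, 12.5, 20.0, 40.0]
arithmetic = sum(times) / len(times)
back_transformed = back_transformed_mean(times)
print(round(arithmetic, 2), round(back_transformed, 2))
```

The back-transformed value is never larger than the arithmetic mean and is less affected by a few very slow performances.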

Summary tables for the analyses of vari- 
ance performed on the relations discussed 
above are shown in Table 2. Extent of ma- 
nipulation is significant at the .05 level or 
greater for both the travel and manipulation 
components on both days one and seven. Di- 
rection of manipulation is significant only for 
the manipulation component on day seven. 





Fig. 3. Mean times for the 42 subjects for 40, 80, and 120 degrees extent of manipulation in clockwise and counterclockwise directions. The means shown are for the first day of the experiment. (Abscissa: extent of manipulation in degrees; ordinate: mean time in sec.; separate curves for manipulation and travel, right and left.)


Clockwise direction is significantly superior 
to counterclockwise direction in this instance 
at the .05 level. Direction does not have 
a significant effect on the manipulation or 
travel times on day one, or on the travel 
component on day seven. None of the direc- 
tion by extent interactions is significant. 


Table 2

Summary of Analysis of Variance for Direction and Extent of Manipulation for Travel and Manipulation Components of Motion

Source of                    Manipulation    Travel
Variation         df              F             F

Day One
  Direction         1             -             -
  Extent            2          72.91***      14.07***
  Interaction       2             -             -
  Error           246

Day Seven
  Direction         1           3.04*           -
  Extent            2          59.19***       4.29**
  Interaction       2             -             -
  Error           246

* Significant at .05 level.
** Significant at .01 level.
*** Significant at .001 level.


Fig. 4. Mean times for the 42 subjects for 40, 80, and 120 degrees extent of manipulation in clockwise and counterclockwise directions. The means shown are for the seventh day of the experiment. (Abscissa: extent of manipulation in degrees; ordinate: mean time in sec.; separate curves for manipulation and travel, right and left.)


Summary and Conclusions


Forty-two subjects were tested on a task involving repetitive switch turning under six different experimental conditions. These conditions consisted of three extents of manipulation, 40, 80, and 120 degrees, and two directions of manipulative movement, clockwise and counterclockwise. Special devices, involving electronic motion analysis techniques and a special planned work situation, were used to obtain separate measurement of the travel and manipulation components of motion under controlled conditions. Each subject performed one trial under each of the experimental conditions on seven successive days.

Learning curves for the travel and manipulation components of motion are presented. Analyses of variance, performed on the data for days one and seven, are summarized to indicate the significance of differences in the duration of travel and manipulation movements in relation to the direction and extent of manipulation.

The results of the study may be summarized as follows:

1. Manipulation movements show a considerably greater improvement due to practice than travel movements. This differential learning effect is evident primarily over the first three days of practice. The change in performance from day one to day seven is highly significant for both motion components.

2. Duration of both manipulation and 
travel time is significantly increased with 
greater extents of manipulative movement 
and, at least for the extents investigated, the 
relations are approximately linear. 

3. Clockwise direction of manipulative 
movement is observed to be significantly 
superior to counterclockwise movements for 
the manipulation motion component on day 
seven. Direction of manipulation is not re- 
lated to duration of manipulation time on 
day one, nor were travel times affected by 
the direction of manipulation on day one or 
day seven. 
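
The practice effects in the first conclusion can be checked against the manipulation figures of Table 1. The sketch assumes that each per cent change is expressed relative to the time on the earlier day of the interval, which the report does not state. Under that assumption the two partial reductions should compound to the reported change from day one to day seven. `combined_change` is an illustrative name.

```python
# Per cent change in manipulation time as reported in Table 1:
# (Day 1 to Day 3, Day 3 to Day 7, Day 1 to Day 7).
MANIPULATION = {
    "40 deg right": (16.08, 8.75, 23.43),
    "80 deg right": (21.11, 8.53, 27.84),
    "120 deg right": (20.00, 10.14, 28.11),
    "40 deg left": (16.13, 7.09, 22.08),
    "80 deg left": (20.23, 5.80, 24.86),
    "120 deg left": (15.19, 6.27, 20.50),
}


def combined_change(first, second):
    """Overall per cent reduction from two successive per cent reductions."""
    return 100 * (1 - (1 - first / 100) * (1 - second / 100))


for condition, (first, second, overall) in MANIPULATION.items():
    print(condition, round(combined_change(first, second), 2), overall)
```

The compounded values agree with the reported seven-day changes to within rounding, and in every condition the larger share of the improvement falls in the first three days.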

Previous studies (6) have shown that the 
durations of both the travel and manipulation 
movements are related to the distance of 
travel movement. Inasmuch as the observa- 
tions reported here show that increasing the 
extent of manipulation lengthens the dura- 
tion of both manipulation and travel, it ap- 
pears that varying the extent of any of the 
components of motion will have effects on 
other component movements involved in the 
pattern. Direction of travel movement has 
previously been shown to affect only the 
travel component of the motion (5). Thus, 
it appears that direction of movement, at 
least under conditions investigated thus far, 
influences only the component of motion 
within which the directional factor occurs. 
Further research is needed to determine more precisely under what conditions direction becomes a relevant variable in determining the duration of movement.

Efforts to systematize the study of human motions in industry and in some phases of human engineering have led both psychologists and engineers to assume that varying the extent of movement has no influence on the duration of such movement (1, 2). Such assumptions have been proven, through dimensional and component motion analysis, to be erroneous for both manipulative and travel components of human manual motion. Both manipulation and travel times vary significantly with increasing extent of motion. In addition, an increase in the extent of one of these component movements in a complex task produces, through interaction of the different movements, a significant increase in the duration of the other component movements in the task.


Received June 18, 1953. 


References 


1. Barnes, R. M. Motion and time study (3rd Ed.). New York: John Wiley, 1949.

2. Ellson, D. G. The application of operational analysis to human motor behavior. Psychol. Rev., 1949, 56, 9-17.

3. Harris, S. J. and Smith, K. U. Dimensional analysis of motion. V. An analytic test of psychomotor ability. J. appl. Psychol., 1953, 37, 136-142.

4. Smith, K. U. and Wehrkamp, R. A. A universal motion analyzer applied to psychomotor performance. Science, 1951, 113, 242-244.

5. Von Trebra, Patricia A. and Smith, K. U. Dimensional analysis of motion. IV. Transfer effects and direction of movement. J. appl. Psychol., 1952, 36, 348-353.

6. Wehrkamp, R. and Smith, K. U. Dimensional analysis of motion. II. Travel distance effects. J. appl. Psychol., 1952, 36, 201-206.







Discussion of Gilliland and Newman’s “The Humm-Wadsworth
Temperament Scale as an Indicator of the ‘Problem’
Employee” 1


D. G. Humm and Kathryn A. Humm 


Humm Personnel Consultants, Los Angeles, California


The first question to be raised is that of 
the title of the article in question. The meth- 
odology reported does not represent the meth- 
odology recommended for using the Humm- 
Wadsworth Temperament Scale in the ap- 
praisal of employees and applicants for 
employment. If the article had been given 
some such title as “The Integration Index 
and Component Control Measures Computed 
from the Humm-Wadsworth Temperament
Scale as Indicators of the ‘Problem’ Em- 
ployee,” we would be less dissatisfied with it. 

It is implied by Gilliland and Newman that they used Humm’s procedures in classifying their subjects according to “risk,” as described in our study of Los Angeles policemen 2 and as discussed in personal conference.

On the contrary, we recommend evaluating Humm-Wadsworth findings for each subject tested by considering each of the following: (1) the raw scores, corrected for response-bias, in comparison with the scores of the subjects of the original standardization study; 3 (2) the degree of response-bias itself, since atypical response-bias has been found to be an indicator of tendencies to problem behavior; (3) the positions of the seven components in the distributions of the scores of employed subjects, but without any implication that conformity to the central tendency is necessarily desirable; (4) the relationship of the Normal component to each of the other components (the component control measures) and to the temperamental pattern as a whole (the integration index), and (5) finally, the temperamental pattern as a whole, derived from all of the measures previously mentioned, and indicating which components are likely to be conspicuously manifested in the subject’s behavior and whether their manifestations will be desirable or undesirable in the situation for which the subject is being considered.

1 Gilliland, A. R. and Newman, S. E. The Humm-Wadsworth Temperament Scale as an indicator of the “problem” employee. J. appl. Psychol., 1953, 37, 176-177.

2 Humm, D. G. and Humm, Kathryn A. Humm-Wadsworth Temperament Scale appraisals compared with criteria of job success in the Los Angeles Police Department. J. Psychol., 1950, 30, 63-75.

3 There were seven pairs of groups, rather than seven groups, and they were not “relatively pure types.” A partial regression technique had to be used to “purify” the data. See: Humm, D. G. and Wadsworth, G. W., Jr. The Humm-Wadsworth Temperament Scale. Amer. J. Psychiat., 1935, 92, 1, 163-200.

In general personnel work, we assign a risk 
rating on the basis of Humm-Wadsworth find- 
ings alone only when those findings are so 
unmistakably unfavorable as to constitute an 
insurmountable handicap even if all findings 
concerning ability should be found to be fa- 
vorable. In our report of the study of Los 
Angeles policemen, we attempted to make it 
clear that we used the Humm-Wadsworth 
findings alone, without partialling out other 
factors related to job success, because the 
policemen in question had been pre-selected 
by the civil service procedure. 

When Gilliland and Newman classified their subjects on a scale which they do not identify and which we cannot recognize for the Integration Index and the Component Control Measures and then assigned risk ratings on the basis of Very Good for all ratings above five and Very Poor whenever any two ratings were as low as one, they were using a procedure we have never recommended and strongly disapprove.

The explanations offered by Gilliland and Newman for the outcome of their study seem to us not to follow from the data reported, i.e., “(1) the test may not adequately measure the components it purports to measure”: no data are presented to indicate whether or not the behavior of the subjects differed from the behavior that might have been predicted from the Humm-Wadsworth results; “(2) these components may not be essential to success in this industry”: the study investigated only a specific set of measures and did not do this in a way which could justify such a conclusion; “(3) the company cannot distinguish between satisfactory and unsatisfactory workers”: no data are presented on the procedures used by the company for determining satisfactory work or for deciding to discharge an employee.

4 Humm, D. G., and Humm, Kathryn A. Measures of mental health from the Humm-Wadsworth Temperament Scale. Amer. J. Psychiat., 1950, 107, 6, 442-449.

The only conclusion we are able to draw 
from this study is that it supports our own 
contention that over-simplified procedures are 
inadequate for appraising workers, but that 
it offers no evidence as to the effectiveness of 
the Humm-Wadsworth, properly used, as one 
of the tools for personnel appraisal. 

Received July 31, 1953. 
Published out-of-turn by the editor. 







Applied Psychology in Action 


Comment on Word Meaning


Fred L. Wells


Department of Hygiene, Harvard University


In the December 1953 issue of the Journal of Applied Psychology, Dr. H. D. Hadley has an insightful note: “The Non-Directive Approach in Advertising Appeals.” For the text, I wonder if there would be any interest in a distinction between credibility and credulity. The latter term seems the one fitting Dr. Hadley’s context better, but the word used is credibility (page 496, line 5 from end, page 497, bottom of column one). There is apparently an “obsolete” use of credibility in Dr. Hadley’s sense. (See Webster.) If interpreted in the current usage, this makes the author’s meaning difficult to follow.


A Note on “The Non-Directive Approach in Advertising Appeals”


Mary Epstein


George Peabody College for Teachers


The December 1953 issue of J. appl. Psy- 
chol. carried an article, the aim of which was 
to point up some similarities between certain 
types of therapy and advertising techniques. 
One of the conclusions reached, that “the 
non-directive technique is quite comparable to 
the inferred technique in advertising ... ,” 
needs further elaboration. 

Showing the benefits of a product, without 
intention to sell (the inferred technique) 
seems, according to some standards, superior 
to the direct appeal, where the advertiser 
tells the consumer to buy the product. Often, 
the direct appeal involves threat to the buy- 
er’s values, particularly when his attention is 
called to the fact that the purchasing of an- 
other product than the one advertised may 
lead to various undesirable results. The in- 
ferred technique minimizes the effects of 
threat, by emphasizing the acceptability of 
the “thing,” and by associating it “with very 
acceptable things, persons or events.” 

Comparison between non-directive therapy 2 and inferred advertising can possibly be made concerning the attempt to reduce threat. The advisability of reducing threat, in any field of human endeavor, is psychologically sound. Further comparison, however, can only be drawn by doing injustice to the basic principles of client-centered therapy.

1 Hadley, H. D. The non-directive approach in advertising appeals. J. appl. Psychol., 1953, 37, 496-498.

2 Rogers, C. R. Client-centered therapy. Chicago: Houghton-Mifflin Co., 1951.

A closer examination of the assumptions 
on which non-directive therapy rests reveals 
that a belief in the client’s ability to develop 
his own value system is a sine qua non of 
successful therapy. To facilitate this process, 
the therapist tries to minimize, as much as
possible, the effects his own value system 
might have on that of the client. 

The principle of non-interference is not ap- 
plicable in the field of advertising, because, 
carried to its logical conclusion, it would 
mean that the advertiser not sell at all. 
Placed in the non-directive framework, where 
selling the client on anything is verboten, the 
advertiser would be in no better position to 
make a product look favorable in the eyes of 
the buyer than is the client-centered therapist 
in the position to steer the client’s value judg- 
ment in the direction of his own. There is a 
fundamental difference between non-directive 
therapy and advertising. The difference lies 
in the realm of commitments and intentions. 
The advertiser is committed to and intends to sell. The therapist aims to help the client achieve more satisfactory adjustment, i.e., happiness, regardless of the values adopted or discarded in the process of therapy. Whereas there is no reason to doubt that some of the elements of non-directive therapy, such as reduction of threat, might prove helpful in raising advertising standards, an unqualified comparison between this type of therapy and advertising is inappropriate.


The Measurement of Academic Freedom 


Willard Kerr 


Illinois Institute of Technology 


Can academic freedom be measured? 
Through most of man’s history it has existed 
in such minute quantity and excited so little 
interest as to discourage evaluation. Today, 
despite the current wave of anti-intellectual- 
ism, academic freedom exists in a magnitude 
unknown to antiquity. But from one insti- 
tution to another, where scholars work, there 
is great variation in academic freedom. 

In 1953, the Academic Freedom Committee, Chicago Division of the American Civil Liberties Union, attempted to measure academic freedom in each of the more than 50 institutions of higher learning in the State of Illinois. With the aid of other members of the committee and the ACLU booklet entitled Academic Freedom and Academic Responsibility, a two-page “test” of academic
freedom was constructed. It was called the 
“Academic Freedom Survey.” It contained 
twelve items on rights of students, seven on 
rights of teachers, and four general rights. 
Each item was answered on a three-point 
scale of “Extent to which right is effectively 
assured—complete; as a general rule; very 
little or none.” Possible scores could range 
between 23 and 69. 
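
The stated score range follows directly from the number of items and the three-point scale, as the sketch below shows. The assignment of 3 points to “complete” and 1 point to “very little or none” is an assumption, since the report gives only the range. `survey_score` is an illustrative name.

```python
# Assumed point values for the three response categories.
POINTS = {"very little or none": 1, "as a general rule": 2, "complete": 3}

# Item counts given in the description of the Academic Freedom Survey.
ITEMS_PER_SECTION = {"students": 12, "teachers": 7, "general": 4}
N_ITEMS = sum(ITEMS_PER_SECTION.values())


def survey_score(responses):
    """Total score for one completed form (one response per item)."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(POINTS[response] for response in responses)


lowest = survey_score(["very little or none"] * N_ITEMS)
highest = survey_score(["complete"] * N_ITEMS)
print(N_ITEMS, lowest, highest)
```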

Design. Approximately 200 of the ques- 
tionnaires were mailed to Illinois colleges ad- 
dressed to: (a) one administrator, usually 
the president; (b) one or more professors; 
and (c) one or more student leaders, usually 
the newspaper editor or student council presi- 
dent. 

Results. A total of 73 replies was received, 
and, while analysis of the data still continues, 
the obtained data do indicate substantial 
freedom variations. The most entrenched 
freedoms are: for faculty, freedom from special requirements (oaths), of association in
faculty organizations, of citizenship activities, 
and of research; and for students, freedom of 
choice of faculty advisers. For faculty, the 
least secure freedoms relate to faculty self- 
government, to tenure (security), and free- 
dom to criticize curriculum and administra- 
tion. For students, the least secure are to 
hear outside speakers, to criticize faculty and 
administration, to organize associations and 
affiliate nationally, of press, of petition, and 
of reasonable off-campus activity. 

These results suggest that serious deficien- 
cies exist in academic freedom for both fac- 
ulty and students, particularly the latter. 
Our young people are expected to hold themselves ready to fight and die for our country, yet we withhold from them the reasonable freedoms which make for individual
responsibility and character growth. This 
statement is qualified by the fact that most 
institutions maintain an admirable situation 
with respect to most of the freedoms studied. 
Total results as analyzed to date indicate 
that the “test” may be most useful as a diag- 
nostic instrument to indicate areas for reme- 
dial attention in a given individual institution. 

The Chicago Division, ACLU Academic 
Freedom Committee now plans to resume 
such surveying, but this time with “Student,” 
“Administrator,” or “Faculty” stamped on 
each form in order to establish points of 
agreement and disagreement among these 
three groups and thus more clearly delineate 
the freedom areas requiring attention in each 
institution. This kind of impartial service in 
the interests of human freedom is traditional 
in the history of the American Civil Liberties 
Union. 





Book Reviews 


Lincoln, J. F. Incentive management. Cleve- 
land: The Lincoln Electric Company, 1951. 
Pp. 280. $1.00. 


This volume is written by the president of 
the Lincoln Electric Company, which manufactures
electric welding equipment. It is an
exposition of the rationale for the system of 
incentive management upon which the com- 
pany is run. The rationale, as digested by 
the reviewer, is as follows: 


1. The primary goal of industry is to make 
a better product to be sold to more people at 
a lower price; a reasonable profit to the stock- 
holders is also important but should be a sec- 
ondary by-product. 

2. This goal is possible only under condi- 
tions of free enterprise and ever increasing 
efficiency of operation. 

3. Such levels of efficiency are possible only 
if workers are motivated to develop their la- 
tent abilities, which are limitless under proper 
incentive conditions. 

4. Workers will develop their latent abili- 
ties only if they are given a direct reward for 
their individual contribution to production. 

5. This direct reward is obtained through 
an incentive wage system and recognition of 
the individual’s ability. 

As can be seen from the above, the ra- 
tionale is essentially an application, in mod- 
ern industry, of the law of competitive strug- 
gle for existence and the survival of the fit. 
Fired by the knowledge of reward and recog- 
nition for demonstrated superiority, human 
beings have limitless possibilities of improve- 
ment. Through “intelligent selfishness” man 
strives on and on toward perfection and great 
strides of progress result. 

The author is thoroughly convinced that 
this rationale is true. Why is he so sure? 
Because under his management according to 
these beliefs, his company has become the 
most productive organization in the industry. 
Charts and tables (in the appendix) show 
that prices of Lincoln-made products steadily 
declined from 1933 to 1949 while those of 
comparable products increased. Sales value 
per employee is double that of the average in 
other industries and other companies in the 
same industry. There is no union; there 
were no work-stoppages due to labor-management
disputes in any year from 1934 to
1949. Productivity increased 15% per year
from 1934 to 1949 compared with only 3% 
per year for all manufacturing industries. The 
average Total Compensation per Employee 
was $7701 in 1950 compared with between 
$3000 and $4000 for six other well-known 
companies, some of which are competitors. 
As the author states, “The conclusion that 
must be drawn from these facts is obvious 
. . . The American economy must adopt in- 
centive management.” 

This is a very difficult book for a psycholo- 
gist to evaluate. Research-oriented, he looks 
for a statement of hypotheses, description of 
procedures designed to test the hypotheses, 
presentation and interpretation of results, 
and conclusions derived from the findings. 
In this book, however, the author presents 
merely an exposition of the “hypotheses,” 
which he presents as axioms, and his proof 
of their validity is the ultimate criterion of 
the production record of the company. Just 
how the principles are translated into opera- 
tion procedures and what the relative con- 
tribution of these procedures is to the over- 
all success of the company are not given, 
either in this volume or in a previous book 
entitled Lincoln’s Incentive System. The
reader is left with a feeling of something 
missing and with nothing to evaluate objec- 
tively and critically. It is like a father who 
dogmatically states that proper diet results 
in healthy children and proudly points to his 
six-foot son as the proof. Can one conclude 
that diet increases height? Certainly not 
without knowledge of what diet, how administered,
and of related variables such as exercise,
height of parents and grandparents, etc.

This is not necessarily to deny his thesis; 
it is just that he hasn’t proved it. In fact, 
except for the fundamental weakness of any 
explanation of human behavior which rests 
upon a single source of motivation, the ex- 
position of his thesis is a fairly well-reasoned, 
consistent, thought-provoking presentation. It 
is not difficult to accept his premises of the 
desirability of direct and immediate reward 
for individual effort, of the importance of 
overt recognition of individual achievement, 
of individual identification with group goals 
through stock ownership in the company, of 
the greater value of the earned security resulting
from self-confidence and assurance of
reward than that granted by a paternalistic 
employer or government. The author has a 
supreme confidence in the ability of man to 
rise to new heights of performance and em- 
phasizes employee development rather than 
selection. The industrial progress which has 
made us a leading nation can be maintained 
only with the continuous change resulting 
from the struggle for existence under condi- 
tions of free competition. The “profit mo- 
tive” is reinstated in full force but with the 
“profit” more equitably distributed—the ma- 
jor share going to the consumer, through 
lower prices, and thereby to the workers who 
themselves are consumers. 

One wonders, however, whether the Lin- 
coln Incentive System is universally appli- 
cable. It is conceivable that it works at 
Lincoln primarily because it is unique. No- 
where in the book does Lincoln discuss selec- 
tion standards or turnover or what happens 
to those employees who do not produce at a 
high level. Assuming that they are weeded 
out or allowed to weed themselves out
through low returns from piece-work wages,
the company may have a highly selected work
force, selected in terms of their responsiveness
to the particular type of incentive and
the level of performance required by his sys- 
tem. One is reminded of the tremendous 
spurt to production accompanying Ford's in- 
troduction of the $5 a day wage but also of 
the levelling off which resulted as time went 
on. 

One is forced to admire the positiveness 
with which Lincoln obviously believes in the 
philosophy underlying his system and the 
courage with which he applies it. He is not 
“afraid” to pay a low-level production worker 
whatever the worker can earn under piece- 
work rates rigidly maintained. He is much 
more concerned with the benefit to the con- 
sumer than to the stockholder, who adds 
nothing to the productive effort. He has 
sincerely attempted to put his beliefs into 
actual practice without compromise and is 
thoroughly convinced of the validity of his 
beliefs. As he states in the companion vol- 
ume previously referred to, “Whatever the 
conclusions of the reader, there is no doubt 
that the incentive-management philosophy
outlined herein is fundamental to man,
whether he is playing a game, raising a 
garden, or living a life.” I wish that psychologists
could be as positive in their knowledge
of human behavior and its application
to life situations. 

Psychologists can profitably read this book. 
They will feel successively annoyed, amused, 
disturbed, provoked, and challenged. The 
price is only one dollar, an example of the 
author’s philosophy of lower prices for the 
consumer. 

Albert S. Thompson
Teachers College, Columbia University


Viteles, Morris S. Motivation and morale in
industry. New York: Norton, 1953. Pp.
xvi + 510. $9.50.

Any new book by Viteles is bound to com- 
mand attention. As one of the first and still 
one of the leading industrial psychologists, 
his work merits and gets the attention of 
workers in the field. 

Viteles’ well known Industrial Psychology 
was first published in 1932, and since that 
time has been considered a classic, if not the 
classic text in the field. Drawing heavily 
upon the experience of psychologists in the 
laboratory as well as in industry, Viteles gave 
a comprehensive picture of the development 
and current status of industrial psychology 
which at that time was considerably less 
robust than it is today. Advocating the view 
that the scope of industrial psychology was 
as extensive as that of psychology itself, 
Viteles nevertheless emphasized individual 
differences. 

In many respects, Motivation and Morale 
in Industry is a continuation of Industrial
Psychology. To a considerable extent Viteles 
has repeated his earlier pattern, but with a 
shift in emphasis from the individual to the 
group. He is still interested in increasing 
productivity, but his frame of reference is 
employee satisfaction and industrial harmony. 
Again he has drawn heavily upon the experi- 
ence of psychologists in the laboratory as 
well as in industry. Again he has synthesized 
the work of other persons. Again he has 
pointed out trends and probable trends. 

Motivation and Morale in Industry is divided
into five parts. The first, consisting of
three chapters, is introductory in nature. It
deals primarily with the economic man and 
the inadequacy of the concept that man can 
live by bread alone. The fifth part, consist- 
ing of four chapters, summarizes and draws 
together the remainder of the book as well 
as makes applications and recommendations. 
The remaining three parts, totaling sixteen 
chapters, comprise the bulk of the book. 
They deal with motivational theory, experi- 
mental studies, and employee attitude surveys. 

Motivation and Morale in Industry is ec- 
lectic. The bibliography refers to books and 
articles from all fields of psychology. Psy- 
choanalytical, topological, and Gestalt psy- 
chology are represented as well as the more 
traditional fields. Various allied disciplines 
are also represented such as philosophy, eco- 
nomics, sociology, endocrinology, medicine, 
and anthropology. Reference is made to 
publications in various languages and to research
conducted in various countries. Included
are Canada, England, Germany, Russia,
and the Netherlands. An extensive period
of time is covered, from William James
to the present. Business and trade publications
are referred to; for example, National
Industrial Conference Board, National Association
of Manufacturers, Factory, and Dun’s
Review. Nontechnical journals and books 
are also included such as New York Times 
Magazine, Survey Graphic, Fortune, and 
Reader’s Digest.

The book is scholarly. In fact it is prob- 
ably too scholarly to be of maximum value 
and use to the audience to whom it is di- 
rected: management in business and industry. 
The book is far more suitable for students 
preparing for work in management. The 
typical business executive is impeded rather 
than helped by phrases such as “sine qua 
non,” “in vacuo,” and the like. He does not 
need nor desire references in foreign lan- 
guages. He does not readily accept the in- 
volved sentence structure or the laborious 
style used by Viteles. A Flesch count of this 
book would place it far beyond the “com- 
fortable” reading level of most management 
people. A re-write of Motivation and Morale 
in Industry will be required if it is to gain 
wide acceptance in industry. Viteles did this 
earlier when Industrial Psychology was rewritten
to fill the need for a shorter and
simpler volume, and was published in 1934 
under the title The Science of Work. 
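
The "Flesch count" mentioned above can be made concrete with a small sketch. What follows is only an illustration, assuming the standard Flesch Reading Ease formula (206.835 - 1.015 x words per sentence - 84.6 x syllables per word); the function names are hypothetical, and the vowel-group syllable counter is a crude assumption rather than Flesch's own counting procedure. Higher scores indicate easier reading.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumed heuristic, not Flesch's hand count):
    count groups of consecutive vowels and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease score for a passage of English text."""
    # Sentences are approximated by splitting on runs of ., ! or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Under this heuristic, a sentence such as the one quoted from Viteles below ("Effective results can be achieved only through systematic research conducted within a sound theoretical context") scores far lower than a run of short, one-syllable sentences, which is the kind of gap the reviewer has in mind.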

Viteles has written his book from a theo- 
retical and experimental viewpoint. This is 
well illustrated by his statement “Effective 
results can be achieved only through system- 
atic research conducted within a sound theo- 
retical context” (p. 66). Would that all 
workers in industrial psychology took this 
view! 

In a sense this book is too much a book of 
readings in motivation and morale in in- 
dustry. Many of the studies are weak, but 
Viteles has done an excellent service in col- 
lecting these studies in such way as to illus- 
trate the primitive status of the field. Fre- 
quently he has added his penetrating insights 
relative to such studies. Nevertheless, the 
reviewer regretted that Viteles had not taken 
a more directly critical view. It were as 
though a skilled surgeon held his scalpel to 
the skin but neglected to make a sharp and 
deep incision. Why? Does Viteles feel less 
sure of himself in the area of the group than 
in that of the individual? Do his own feel- 
ings emphasize the individual, but his intel- 
lect tell him to emphasize the group? Or, is 
he a highly tolerant man who is convinced 
that more harm than good would result from 
a more critical attitude at this time? 

The book was published prematurely in one 
respect. References were added after the 
type was set without changing numbering. 
Thus, the same number frequently appears 
successively, the second being followed by a 
letter subscript. This may be minor, but is 
apt to give some readers the impression of 
haste or carelessness which is inappropriate 
in a book of this type. It is hoped that re- 
printing will correct this defect. 

In spite of its deficiencies, this is a book 
which should be studied carefully by all who 
profess to be interested in industrial psychol- 
ogy. It pulls together much material which 
has lacked structuralization. In so doing 
Viteles has done a valuable service albeit the 
material is primitive. Out of such syntheses 
can come considerable improvement in future 
motivational theory and experimentation. 

In writing this book, Viteles has not blindly 
jumped onto the band wagon of “group
think.” This is particularly refreshing inasmuch
as so many psychologists seem to disregard
their own teachings and to follow the
“all or none” hypothesis in evaluating schools,
viewpoints, methods, and procedures in the 
field of psychology. As Viteles points out, 
“The emergence of a ‘social psychology’ does 
not require or justify the abandonment of 
‘individual psychology’ in approaching or 
solving the problems of motivation and mo- 
rale in industry” (p. 391). 
Clifford E. Jurgensen 
Minneapolis Gas Company 


Redfield, Charles E. Communication in man- 
agement. Chicago: University of Chicago 
Press, 1953. Pp. xvi + 290. $3.75.

Redfield’s book presents an excellent broad
view of the problem of communication in
industry as well as information on how to
handle rather specific problems. The author
states that while the means of communication
have now reached their greatest development,
“intelligibility” in industrial communication
is at its lowest stage in history. This, he
says, is due to: (a) the increasing size of
modern organizations; (b) lack of training in
wise language usage; and (c) the specialization
and segmentation of work today. Of
the importance of communication, however,
Redfield leaves no doubt when he quotes
Fortune’s new motto for business, “Communicate
or Founder.”

The book is arranged in five parts. The 
first part provides a general introduction to 
the problem, and contains highly useful guid- 
ing principles for effective communication. 
It is necessarily general in scope, but it does 
seem to give too little attention to one as- 
pect of communication, effectiveness as a 
function of the educational differences of 
“communicator” and “communicatee.” The 
goal is stated as having members of the audi- 
ence improve their language facility (as well 
as having the communicator improve his way 
of using language). This is desirable, but
the practical question remains whether or not 
communication can be effective if the reader 
or hearer cannot understand. It would have 
seemed worth while to present more informa- 
tion on how to make writing and reading more 
understandable, and how to check on this 
through readability formulas. Redfield, it
should be said, does not deny the importance
of the problem, however, for he says earlier 
that “In the America of the 1950’s, literacy 
will have to be measured in terms of compre- 
hension of transmitted ideas and concepts.” 

Part II of the book takes up “communication
downward and outward,” the most important
aspect of which is order-giving. After
a description of kinds of orders, Redfield goes 
on to a discussion of oral versus written 
presentation. He then takes up individual 
messages and circulars, manuals, and hand- 
books. The presentation is thorough, but this 
very thoroughness in itself leads to some gen- 
eralizations that may not always be accurate. 
For example, Redfield says a safe rule of 
thumb in distinguishing manuals and hand- 
books is that “if personal pronouns appear in 
the text, it is a handbook and not a manual.” 
This is, however, a minor point as far as the
whole presentation is concerned. 

In Part III, Redfield presents “communi- 
cation upward and inward.” He gives chief 
attention to “administrative reporting” as es- 
sential to the executive, but also takes up 
suggestion (and complaint) systems, inter- 
views, and employee opinion polls. The over- 
all presentation is excellent, and should in- 
troduce new approaches to many readers. 

Part IV of the book is an interesting pres- 
entation of “horizontal communication,” or 
such cross-talk as clearance, review, and con- 
ferences. Horizontal communication, as Red- 
field points out, is of increasing importance 
because of growing specialization in industry. 

In the final section of the book (Part V),
Redfield presents his views of the future of 
communication in management. The pres- 
entation is largely in terms of organization in 
management and its relation to communica- 
tion. Recent changes in organizational struc- 
ture (reduction of number of management 
levels) in several large corporations, and the 
effect on communication, provide interesting 
reading. 

All in all, the book is a valuable one, 
chiefly for its survey of the field and its com- 
plete list of references and selected readings. 
It should prove useful to most readers con- 
cerned with management, but particularly to 
those who have not recognized the extent of 
communication that goes on in industry, or
how more effective communication can improve
industrial efficiency.
George Klare 
University of Illinois 


Tyler, Leona E. The work of the counselor. 
New York: Appleton-Century-Crofts, 1953. 
Pp. 323. $3.00. 

During the past four years an unusual num- 
ber of textbooks on counseling have appeared. 
Some of the texts have been elaborations or 
developments of the nondirective point of 
view; some have been restatements of older 
points of view modified to incorporate a 
greater emphasis on counseling as contrasted 
with diagnosis; and one or two have been at- 
tempts at something like a synthesis. Tyler's 
text belongs in this last category, and, in this 
reviewer's judgment, is outstandingly success- 
ful in this class. 

As the title indicates, Tyler has attempted, 
not to describe a theory of counseling, but to 
write of the peculiar work of the counselor,
marshalling ideas from experience and from 
research to throw light on how counseling 
may most successfully be done. It is therefore
an eclectic book in its approach, predominantly
nondirective in its philosophy and
techniques, but making use of the contributions
of testing, occupational information, and
environmental resources in a manner more
commonly associated with other points of 
view. Tyler makes her own synthesis of 
these approaches. The result is a very read- 
able text, suitable for relatively unsophisti- 
cated students, in which each chapter con- 
cludes with a concise critical summary of 
relevant research which makes the text ap- 
propriate for students with more background 
and for practitioners. 

The functions of the counselor in modern
society are effectively dealt with in Chapter 
I, thus starting out by putting the counselor's 
work in good social and psychological per- 
spective. Chapter II discusses interviewing, 
stressing the perceptual skills of the counselor 
and reflection of feeling as a tool but point- 
ing out that these are procedures used by a 
warm person communicating with another, 
not tricks of the trade. Nondirective theory 
is revised here, for example, with the recog- 
nition that verbal structuring is of little 
value, that effective structuring is behavioral
rather than verbal. Chapter III deals with
records in a manner that is refreshing among 
texts of this type: instead of discussing the 
construction of cumulative records, Tyler 
treats them as aids to counseling, as sources 
of hypotheses to explore in counseling, as a 
means of orientation to a client rather than 
as bases for diagnosis. She conceives of the 
counselor's province as being the client's feel- 
ings and attitudes, not objective facts, and 
she would leave these and the manipulation 
of the environment largely to other person- 
nel workers. 

The chapter on diagnosis therefore rec- 
ommends that counseling not be organized 
around this activity, as it typically is in non- 
Rogerian settings, but that diagnostic activi- 
ties be relied upon for initial screening and 
particularly as means of helping the client to 
understand himself. Data showing that clini- 
cal predictions are not valid, but that coun- 
seling with tests improves vocational decision- 
making are cited, and ways of helping clients 
use test results are discussed in a manner 
which effectively brings together the contribu- 
tions of nondirective and diagnostic counsel- 
ing. Chapters V and VI deal with tests, 
leaving data on the construction and valida- 
tion of specific tests to other textbooks, and 
concentrating on what tests can contribute to 
the self-understanding of the client and how 
the counselor can use them for this purpose. 
The generally admitted desirability of at least 
one nondirective interview before testing so 
that problems may be aired, the more de- 
batable advantage of testing by batteries in- 
stead of giving a single test when interview- 
ing brings up the need for that kind of fact 
and other tests as other facts are needed, and 
the equally debatable value (in this reviewer's 
opinion) of written reports for clients, are 
brought out. 

The chapter on occupational information 
also stresses the use of such information in 
counseling, although brief attention is paid 
to sources in passing. Thus the distinctive 
emphasis of this text is maintained, relying 
on standard texts for information on sources 
and tools and concentrating on how the coun- 
selor uses them in counseling. The stress on 
occupational information which characterized 
early vocational guidance, the later rejection 
of this method by some in favor of testing
and still later by others in favor of counseling
concerning attitudes, are placed in nice
perspective (although some details of histori- 
cal explanation are incorrect as in the failure 
to recognize that early writers such as Par- 
sons also advocated self-understanding and 
counseling), and a synthesis of these ap- 
proaches and methods such as that which 
characterizes much of the best contemporary 
counseling is achieved. Occupational infor- 
mation is seen as a means of reality testing. 

Chapter VIII deals with psychotherapy, 
and Chapter IX with decision-making inter- 
views, thereby putting this text practically in 
a class by itself for comprehensiveness and 
balance in coverage. Tyler stresses the unity 
of the person and hence of counseling, laments 
false distinctions between personal and voca- 
tional counseling (still incorrectly attributed 
to the Veterans Administration), argues in 
favor of counseling which deals with voca- 
tional choice as part of the development of 
the person, and at the same time recognizes 
that people do have to make occupational de- 
cisions. In dealing with psychotherapy she 
stresses the importance of the relationship, 
and makes the nice point that reflection of
feeling is not so much a technique of treatment
as a means of conveying to the client
that communication is taking place. Tyler
is appropriately modest concerning our knowl- 
edge of psychotherapy, and points out issues 
concerning which we lack information. The 
analysis of the processes of decision-making 
and of counseling in this connection is origi- 
nal and helpful. 

In Chapter IX the school counselor is 
placed in the context of the school as one per- 
sonnel worker, with the peculiar function of 
trying not to decide things for the student. 
This is a helpful distinction between counsel- 
ing and administrative functions, but not one 
which fits the school counselor’s job as struc- 
tured in most schools, where the counselor is 
also expected to handle discipline, program- 
ming, and a variety of decision-forcing, as 
contrasted with facilitating, functions. The 
use of community resources and agencies by 
the counselor is discussed, but not in any 
detail. 

A chapter on the selection and training of 
counselors, and one on evaluation, bring the
book to a close. The former mentions various
professional associations, but makes no
mention of the American Personnel and Guid- 
ance Association as that which, in 1950, re- 
sulted from the unification of all but one of 
those listed, follows the style of the Michi- 
gan Conference in referring to counselor-psy- 
chologists instead of the more recent officially 
adopted APA term of counseling psycholo- 
gists, and fails to recognize that many coun- 
selors and counseling psychologists are em- 
ployed in community agencies, hospitals, and 
industrial or business concerns. It is other- 
wise up-to-date and helpful, particularly in 
its discussion of the self-selective functioning 
of a good counselor-training program pro- 
vided there has been initial screening for 
academic ability. 

The final chapter is an excellent critical 
review of evaluative studies, except for the
curious failure to note the inadequacy of
Latham’s study which results from its attempt
to relate test scores to occupational
success after one year of work (shown by 
career pattern research and longer-term fol- 
low-ups to be too brief and early a period to 
be meaningful), the even stranger omission 
of Strong’s studies on the occupational pre- 
dictive value of tests, and the final erroneous 
conclusion which Tyler therefore reaches, to 
the effect that tests have no predictive value 
for occupational success and satisfaction. 

Three appendices include an intake form, 
notes on some interviews, and selected read- 
ings. The first two are not coordinated with 
the text and hence have little value beyond 
what the reader can derive from examining 
them himself; the last contains a number of 
helpful references, but excludes all treatises 
on counseling other than the nondirective 
(e.g., Robinson, Hahn and McLean, William- 
son), surely a mistake in a text which does 
as good a job of synthesizing viewpoints as 
does this. 

A few criticisms of details and a few ma- 
jor weaknesses should be mentioned, before 
reaching an over-all evaluation. 

The apparent desire to write a smoothly
reading, easily digested text occasionally re- 
sults in less specificity of facts than is desir- 
able, as in the failure to mention the GATB 
as the USES test under discussion on page
130. It results, furthermore, in slighting the
originators of ideas, for while Roethlisberger 
and Dickson are mentioned as having devel- 
oped a nondirective approach simultaneously 
with Rogers (but in 1939 rather than 1937 
as stated), Otto Rank and Jessie Taft are not 
mentioned as important precursors. Many 
other points made and ideas expressed in the 
text appear as though they were Tyler’s, with 
no indication of when they are original, when 
they are a part of the thinking of contempo- 
rary counseling psychologists, or when they 
are novel ideas first expressed by other psy- 
chologists in the literature on counseling. 
The contributions of others to Tyler's think- 
ing get recognition only if they are research 
contributions, for the only theoretical con- 
tributions acknowledged are the two non- 
directive sources mentioned above and Bordin 
and Bixler’s article on test selection by clients, 
although others are clearly traceable in Tyler’s
writing. Finally, the book should have
a subtitle, “In Educational Settings,” for as
pointed out above it is written in terms primarily
of the counselor in a college or university,
and to a lesser extent the high school
counselor, and disregards the fact that many
counselors work in social agency, medical,
and industrial settings. This limitation does 
not lessen the value of the book for theory 
or technique, but it does make its discussion 
of some operating problems less valuable than 
it might be to these other counselors. 

In this reviewer's judgment, this book is 
the first genuine attempt at a synthesis of 
what we know about counseling. Its prede- 
cessors have described approaches developed 
by individuals or groups of psychologists 
working together in one setting, and hence
have been biased by the theoretical predilec- 
tions and experimental limitations of the con- 
tributors. Tyler has, as pointed out above, 
a theoretical bias in favor of nondirective 
counseling, one which caused her inadequately 
to summarize and evaluate the research on 
the occupational predictive value of tests. 
She has apparently had limited experience in 
other than educational settings, which results 
in the slighting of community resources and 
work in other settings. But she has drawn on 
research and theory regardless of school and 
has critically examined her own work, and
has thus achieved a breadth of viewpoint,
variety of technique, and comprehensiveness 
of scope which make her book unique. To 
put it in a nutshell, although it seems to this 
reviewer that the book shows an as yet in- 
complete recovery from the impact of the 
nondirectivists, it is an extremely valuable 
text which many of us active counseling psy- 
chologists would be glad to have written our- 
selves! 
Donald E. Super 
Teachers College, Columbia University 


Recommended practice for residence lighting. 
New York: Illuminating Engineering So- 
ciety, 1953. Pp. 44. $1.00. 

This pamphlet, prepared by the Committee 
on Residence Lighting of the I.E.S., contains
information useful to architects and to ap- 
plied psychologists who are concerned with 
specifying illumination which will provide an 
attractive living space as well as comfortable 
and efficient vision in the home. Important 
developments in the field provide a basis for 
marked improvement over the first Recom- 
mended Practice of Home Lighting which ap- 
peared in 1945. The present pamphlet is 
concerned mainly with basic lighting require- 
ments for family activities which involve 
close vision. 

It is gratifying to find a strong emphasis 
placed upon those factors which promote comfortable
vision. Among these are: (1) a
coordination of decoration (painting) with 
lighting to achieve satisfactory distribution 
of illumination; (2) the maintaining of satis- 
factory brightness ratios in the field of view 
and the surroundings; (3) selection of light 
sources; and (4) lighting for specific visual 
tasks such as sewing, dining, etc.

The numerous pictures and figures illus- 
trating types of fixtures and desirable ar- 
rangements of lighting for specific seeing 
tasks are well chosen. Limitations as well as 
uses are incorporated into much of the dis- 
cussion. Helpful materials are given in the 
appendix: detailed description of typical in- 
candescent lamps and of fluorescent tubes, 
luminaire classification, lighting maintenance, 
and glossary of technical terms. 

There have been two rather marked increases
in the light intensities recommended in 1953
in comparison with those recommended in
1945: for sewing dark fabrics, from 100 to
150 footcandles; for average sewing, from
40 to 80 footcandles. In several instances the
recommended intensities are higher than can 
be justified by research findings. Except 
where casual seeing is involved, the tendency 
is to recommend at least 40 footcandles. 

This is, in general, an excellent pamphlet 
on home lighting. The careful reader with 
a knowledge of the field can approve of all 
the material except the recommended light 
intensities. 

Miles A. Tinker
University of Minnesota


Bullock, Robert P. Social factors related to 
job satisfaction, a technique for the meas- 
urement of job satisfaction. Research 
Monograph Number 70. Columbus, Ohio: 
Bureau of Business Research, 1952. Pp. 
105. $2.00. 

This monograph is the report of a research 
study designed to discover the relationship of 
certain social factors to job-satisfaction and 
to employ these factors in a scale for the 
measurement of job-satisfaction. The basic 
assumption underlying the study is that the 
individual’s work behavior and adjustment 
depend upon his sentiments and attitudes. 
It is further assumed that these sentiments 
and attitudes are a result of his attempt to 
achieve personal adjustment within at least 
three separate, interacting social systems: 
the informal work group, the formal work 
organization and the larger social commu- 
nity within which the employing industry is 
located. In this study, job-satisfaction is 
considered to be an attitude resulting “from 
a balancing and summation of many specific 
likes and dislikes experienced in connection 
with the job.” 

Two measuring instruments were prepared, 
a Job-Satisfaction Scale for use as a criterion 
and a Social-Factor Questionnaire. The Job- 
Satisfaction Scale was of the multiple answer 
type patterned closely after the Hoppock 
scales. The Social-Factor Questionnaire
consisted of 129 items designed to inventory
conditions on the job, in the home, in the
community and attitudes of the worker.
Seventy-five of these were in Y, ?, N format,
thirty were Agree, ?, Disagree items.
Twenty-four were multiple answer questions
sampling personal background information.
(All are presented for the reader’s examination
in appendices to the monograph.)

The instruments were pre-tested on a group 
of 53 male juniors and seniors in college, all 
of whom had held full time jobs. Validation 
was accomplished on this group and on two 
samples from an animal registration associa- 
tion. One hundred currently employed per- 
sons comprised the first sample and 124 ex- 
employees the second. The Job-Satisfaction 
Scale was checked by testing its ability to 
differentiate between groups judged “satisfied”
and those judged “dissatisfied” (judgments
made by a panel on the basis of personnel
data), between individuals who gave
“satisfied” answers to three factual questions
and those who did not, and between current
and ex-employees. This last differentiation
was required on the assumption that dissatis- 
faction might be more intense and more fre- 
quently associated with termination of em- 
ployment. 

To assess the validities of the Social Factor
Questionnaire items, individuals in all
three samples were divided into extreme 
groups on the basis of Job-Satisfaction Scale 
scores. Each item was then evaluated in 
terms of the CR of the difference between 
“Satisfied” group responses and “Dissatisfied”
group responses. Objections to the instability
of CR in small samples were met by
requiring high CR’s in all three samples. 
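
The item-screening rule described above can be illustrated with a short sketch. It assumes that CR denotes the critical ratio of the difference between two independent proportions, that is, the difference divided by its standard error. The review does not give the monograph's exact formula, so the function names, the unpooled standard error, and the cutoff of 2.0 are illustrative assumptions and not the author's procedure.

```python
import math


def critical_ratio(yes_sat, n_sat, yes_dis, n_dis):
    """Critical ratio for the difference between two independent proportions.

    yes_sat / n_sat: count endorsing the item and size of the "Satisfied" group.
    yes_dis / n_dis: the same for the "Dissatisfied" group.
    """
    p_sat = yes_sat / n_sat
    p_dis = yes_dis / n_dis
    # Unpooled standard error of the difference (an assumed choice).
    se = math.sqrt(p_sat * (1 - p_sat) / n_sat + p_dis * (1 - p_dis) / n_dis)
    if se == 0:
        raise ValueError("standard error is zero; the ratio is undefined")
    return (p_sat - p_dis) / se


def item_retained(crs, threshold=2.0):
    """Keep an item only if |CR| reaches the (hypothetical) cutoff in every sample."""
    return all(abs(cr) >= threshold for cr in crs)


# Hypothetical counts: 20 of 25 satisfied vs. 10 of 25 dissatisfied say "Y".
cr = critical_ratio(20, 25, 10, 25)  # about 3.16
```

Requiring the cutoff to be met in each sample separately, as `item_retained` does, is a stricter rule than applying it once to pooled data, which is how the small-sample objection is answered.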

The author deserves commendation for his 
adaptation of the Social Factor Questionnaire 
to the measurement of job-satisfaction and 
for his attempt to validate his instruments. 
All too frequently measures in this area are 
offered to the public without any systematic 
attempt at validation. Further research, 
however, is necessary before these results are 
generalized to other industries. The popula- 
tion utilized was probably well chosen for the 
purposes of a prototype study. However, the 
unusualness of the occupation, the small num- 
bers involved and the fact that its workers 
were non-union suggest the need for cross- 
validation. 

Howard L. Roy
Personnel Research Branch,
TAGO, Department of the Army








New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, 
Editor, Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota. 


Class, status and power. Reinhard Bendix
and Seymour Martin Lipset, Editors.
Glencoe, Ill.: The Free Press, 1953.
Pp. 732. $7.50.

Science and man’s behavior. Trigant Burrow.
New York: Philosophical Library, 1953.
Pp. 564. $6.00.

The teaching-learning process. Nathaniel
Cantor. New York: The Dryden Press, 1953.
Pp. 350. $2.90.

Fundamental psychiatry. John R. Cavanagh
and James B. McGoldrick. Milwaukee:
The Bruce Publishing Company, 1953.
Pp. 582. $5.50.

The transfer value of guided learning.
Robert C. Craig. New York: Bureau of
Publications, Teachers College, Columbia
University, 1953. Pp. 85. $2.75.

The role of growth hormone in carbohydrate
metabolism. R. C. De Bodo and M. W.
Sinkoff. New York: The New York Academy
of Sciences, 1953. Pp. 38. $1.00.

The sales department looks at costs. M. J.
Dooher, Editor. New York: American
Management Association, 1953. Pp. 30.
$1.25.

The American sexual tragedy. Albert Ellis.
New York: Twayne Publishers, 1954.
Pp. 288. $4.50.

Self-perception in the university. Edgar Z.
Friedenberg and Julius A. Roth. Chicago:
The University of Chicago Press, 1953.
Pp. 102. $1.75.

The human senses. Frank A. Geldard. New
York: John Wiley & Sons, 1953. Pp. 365.
$5.00.

Functional motor efficiency of the eyes and
its relation to reading. Luther C. Gilbert.
Berkeley: University of California Press,
1953. Pp. 231. $1.00.

A clinical approach to children’s Rorschachs.
Florence Halpern. New York: Grune &
Stratton, Inc., 1953. Pp. 270. $6.00.

Mechanism of corticosteroid action in disease
processes. Oscar Hechter. New York:
The New York Academy of Sciences, 1953.
Pp. 192. $3.50.


Introduction to psychology. Ernest R. Hil- 
gard. New York: Harcourt, Brace and 
Company, Inc. Text Edition, $5.75. Stu- 
dent Guide and Workbook, $1.50. 

Sex ethics and the Kinsey reports. Seward
Hiltner. New York: Associated Press,
1953. Pp. 238. $3.00.

Religion, science and human crises. Francis
L. K. Hsu. New York: Grove Press, 1952.
Pp. 142. $3.50.

Two essays on analytical psychology. C. G.
Jung. New York: Bollingen Foundation,
Inc., 1953. Pp. 329. $3.75.

A speculation in reality. Irving F. Lauks.
New York: Philosophical Library, 1953.
Pp. 154. $3.75.

Adolescence. Marguerite Malm and Olis G.
Jamison. New York: McGraw-Hill Book
Company, Inc., 1953. Pp. 512. $5.00.

Men and unions. John G. Mapes. New
York: Group Attitudes Corporation, 500
Fifth Avenue, 1953. Pp. 36. $1.00.

The achievement motive. David C.
McClelland, John W. Atkinson, Russell A.
Clark, and Edgar L. Lowell. New York:
Appleton-Century-Crofts, Inc., 1953.
Pp. 424. $6.00.

And lo, the star. Margaret Aikins McGarr.
New York: Pageant Press, 1953. Pp. 116.
$2.50.

Mental health in the home. Laurence
Spurgeon McLeod. New York: Twayne
Publishers, 1953. Pp. 243. $3.50.

Techniques of living. William H. Mikesell.
Harrisburg, Pa.: The Stackpole Company,
1953. Pp. 338. $3.95.

Psychoanalysis and personality. Joseph
Nuttin. New York: Sheed and Ward, Inc.,
1953. $4.00.

Method and theory in experimental
psychology. Charles E. Osgood. New York:
Oxford University Press, 1953. Pp. 976.
$10.00.

Education and society. A. K. C. Ottaway.
New York: Grove Press, 1954. Pp. 182.


Personality and adjustment. William L. 
Patty and Louise Snyder Johnson. New 
York: McGraw-Hill Book Company, Inc., 
1953. Pp. 403. $4.75. 

Child psychology. Leigh Peck. Boston: D.
C. Heath and Company, 1953. Pp. 536.
$5.25.

Conciliation in action. Edward Peters. New
London, Conn.: National Foremen’s
Institute, Inc., 1953. $4.50.

The child’s conception of number. Jean
Piaget. New York: The Humanities Press,
Inc., 1953. Pp. 248. $5.00.

Shame and guilt. Gerhart Piers and Milton
B. Singer. Springfield, Ill.: Charles C
Thomas, Publisher, 1953. Pp. 86. $3.25.

Adrenal cortex. Elaine P. Ralli, Editor.
New York: Josiah Macy, Jr. Foundation,
1953. Pp. 165. $4.00.

Existential psychoanalysis. Jean-Paul Sartre.
New York: Philosophical Library, 1953.
Pp. 275. $4.75.

The adolescent: A book of readings. Jerome
M. Seidman, Editor. New York: The
Dryden Press, 1953. Pp. 798. $4.50.

Know your doctor. Leo Smollar and Neil
Morgan. Boston: Little, Brown & Company,
1954. Pp. 173. $3.00.

Father relations of war-born children. Lois
Meek Stolz. Stanford, Calif.: Stanford
University Press, 1954. Pp. 365. $4.00.

Saving children from delinquency. D. H.
Stott. New York: Philosophical Library,
1953. Pp. 266. $4.75.

Handwriting: A personality projection. Frank
Victor. Springfield, Ill.: Charles C Thomas,
Publisher, 1953. Pp. 168. $1.75.

The psychology of thinking. W. Edgar
Vinacke. New York: McGraw-Hill Book
Co., Inc. Pp. 370. $6.00.


Cybernetics. Heinz Von Foerster, Editor. 
New York: Josiah Macy, Jr. Foundation, 
1953. Pp. 184. $4.00. 

Hypnotism: An objective study in suggesti- 
bility. André M. Weitzenhoffer. New 
York: John Wiley & Sons, Inc., 1953. Pp. 
380. $6.00. 

An introduction to scientific research. E. 
Bright Wilson, Jr. New York: McGraw- 
Hill Book Company, Inc., 1953. Pp. 375. 
$6.00. 

Driver characteristics and accidents. High- 
way Research Board, Washington, D. C.: 
National Academy of Sciences—National 
Research Council, 1953. Pp. 54. $.90. 

Report of highway safety research correlation 
conferences. Committee on Highway Safety 
Research. Washington, D. C.: National 
Academy of Sciences—National Research 
Council, 1952. Pp. 63. 

The field of highway safety research. Com- 
mittee on Highway Safety Research. Wash- 
ington, D. C.: National Academy of Sci- 
ences—National Research Council, 1952. 
Pp. 42. 

I.E.S. recommended practice for residence
lighting. I.E.S. Committee on Residence
Lighting. New York: Publications Office, 
Illuminating Engineering Society, 1860 
Broadway. Pp. 44. $1.00. 

The Social Welfare Forum. National Con- 
ference of Social Work. New York: Co- 
lumbia University Press, 1953. Pp. 365. 
$5.00. 

Group report of a program of research in 
psychotherapy. Psychotherapy Research 
Group, The Pennsylvania State College.
State College, Pa.: William U. Snyder, De- 
partment of Psychology. Pp. 179. $2.25. 





New McGraw-Hill Books 


PSYCHOLOGY APPLIED TO HUMAN AFFAIRS. New second edition 


By J. Stanley Gray, University of Georgia. McGraw-Hill Series in
Psychology. 581 pages, $6.00


A revision of Psychology in Human Affairs, this text remains essentially the same 
in purpose and tone, but is completely modernized. An introduction to application 
of psychology in more than twenty fields, each chapter presents uses and
data on a specific field. The book also includes methodology of research in these
various fields. 


PERSONNEL MANAGEMENT. New fifth edition 


By Walter Dill Scott, Northwestern University; Robert C. Clothier,
Rutgers University; and William R. Spriegel, University of Texas. 685
pages, $6.50


This comprehensive new revision eliminates outdated material and brings the text 
completely up to date. A new chapter on “Personnel Management as a
Coordinating Function” has been added to complete this outstanding outline of
principles, practices, and instruments in the important relationships between management and
workers. 


Correlated Text-Films and Filmstrips—Five 16 mm. sound motion pictures and five 
follow-up filmstrips. In preparation 


PSYCHOLOGY 


By William J. Pitt, Brooklyn College; and Jacob A. Goldberg, Director,
Social Hygiene Division, New York Tuberculosis and Health Association and 
Lecturer, New York University College of Medicine. In press 


Here is a brief, generously-illustrated text which grew out of and is designed for 
an elementary psychology course offered in general education, adult education, and 
extension programs. Its orientation is social and personal, and its aim is to help 
the individual understand and adjust to his role in the home, the school, the com- 
munity, and the job. Its orientation is practical; its psychological content sound. 


SO THIS IS COLLEGE 


By Paul H. Landis, The State College of Washington. 205 pages, $3.75
(text edition available) 


A sympathetic treatment of the adjustments which the student must make when 
he moves from high school to college, with its intensified social and scholastic com- 
petition. Written in a warm, lively fashion, it is based on the experiences of more 
than a thousand students. 


McGraw-Hill Book Company, Inc.
330 West 42nd Street, New York 36, N. Y.

Copies on approval




















Ready now . . . the revised Second Edition of the internationally famous. . . 


MANUAL OF CHILD PSYCHOLOGY 


Edited by Leonard Carmichael, Smithsonian Institution. 


This comprehensive manual is the 
work of 22 leading scientists in the fields 
of ethnology, anthropology, education, 
psychology, and medicine. In one volume 
they have combined a major portion of 
the relevant data and researches in child 
psychology. The second edition of their 
well-known work includes three completely
new chapters—The Adolescent,
Psychopathology of Childhood, and Social
Development in the Child—and
has been revised to incorporate findings 
of modern research. 


Manual of Child Psychology demonstrates
that a large body of important
and reliable facts concerning the details 
of human mental development can be 
obtained by the use of appropriate 
techniques. The conclusions reached 
under these conditions differ from the 
vague theories of the prescientific era. 
This book, packed with an amazing 
amount of established scientific facts, 
evidences the end of the speculative 
era in child psychology. A Wiley
Publication in Psychology, Herbert A.
Langfeld, Advisory Editor. 


1954, 1295 pages. $12.00. 


NATIONALISM AND SOCIAL COMMUNICATION 
By Karl W. Deutsch, The Massachusetts Institute of Technology. A
Technology Press book, M. I. T. 1953. 292 pages. $6.00.


Send for on-approval copies.

JOHN WILEY & SONS, Inc.
440-4th Ave., New York 16, N.Y.


